M208 


Pure mathematics 


Book A 
Introduction 


The Open 
University 


This publication forms part of an Open University module. Details of this and other Open University modules 
can be obtained from Student Recruitment, The Open University, PO Box 197, Milton Keynes MK7 6BJ, 
United Kingdom (tel. +44 (0)300 303 5303; email general-enquiries@open.ac.uk). 


Alternatively, you may visit the Open University website at www.open.ac.uk where you can learn more about 
the wide range of modules and packs offered at all levels by The Open University. 


The Open University, Walton Hall, Milton Keynes, MK7 6AA. 
First published 2018. 
Copyright © 2018 The Open University 


All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, transmitted or 
utilised in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without 
written permission from the publisher or a licence from the Copyright Licensing Agency Ltd. Details of such 
licences (for reprographic reproduction) may be obtained from the Copyright Licensing Agency Ltd, Barnard’s 
Inn, 86 Fetter Lane, London EC4A 1EN (website www.cla.co.uk). 


Open University materials may also be made available in electronic formats for use by students of the 
University. All rights, including copyright and related rights and database rights, in electronic materials and 
their contents are owned by or licensed to The Open University, or otherwise used by The Open University as 
permitted by applicable law. 


In using electronic materials and their contents you agree that your use will be solely for the purposes of 
following an Open University course of study or otherwise as licensed by The Open University or its assigns. 


Except as permitted above you undertake not to copy, store in any medium (including electronic storage or use 
in a website), distribute, transmit or retransmit, broadcast, modify or show in public such electronic materials in 
whole or in part without the prior written consent of The Open University or in accordance with the Copyright, 
Designs and Patents Act 1988. 


Edited, designed and typeset by The Open University, using LaTeX.
Printed in the United Kingdom by Hobbs the Printers Limited, Brunel Road, Totton, Hampshire, SO40 3WX.


ISBN 978 1 4730 2266 9 
2.1 


Contents 


Unit A1 Sets, functions and vectors 
Introduction to Book A 
Introduction 


1 Points, lines and distance 
1.1 The plane 
1.2 Three-dimensional space 


2 Sets 
2.1 What is a set? 
2.2 Set notation 
2.3 Intervals 
2.4 Plane sets 
2.5 Set equality and subsets 
2.6 Set operations 


3 Functions 
3.1 What is a function? 
3.2 Image set of a function 
3.3 Inverse functions 
3.4 Composite functions 


4 Vectors 
4.1 What is a vector? 
4.2 Components and the arithmetic of vectors 
4.3 Vector form of the equation of a line 
4.4 Scalar product 
4.5 Equation of a plane in $\mathbb{R}^3$


Summary 
Learning outcomes 


Solutions to exercises 


Unit A2 Number systems 


Introduction 


1 Real numbers
1.1 Standard subsets of the real numbers
1.2 Arithmetic of real numbers
1.3 Solutions of polynomial equations
1.4 The Factor Theorem


2 Complex numbers 

2.1 What is a complex number? 

2.2 Arithmetic of complex numbers 

2.3 Polar form 

2.4 Complex roots of polynomials 

2.5 The complex exponential function 

2.6 Summary: Cartesian, polar and exponential form 
3 Modular arithmetic 

3.1 The Division Theorem 

3.2 Congruence 

3.3 Operations in Zn

3.4 Multiplicative inverses in Zn 

3.5 Solving linear equations in Zn 
Summary 


Learning outcomes 


Solutions to exercises 




Unit A3 Mathematical language and proof 
Introduction 


1 Mathematical statements 
1.1 Statements and negations 
1.2 Conjunctions and disjunctions 
1.3 Implications 
1.4 Equivalences 
1.5 Universal and existential statements 


2 Direct proof 
2.1 Proving implications 
2.2 Proving equivalences 
2.3 Proving existential and universal statements 
2.4 Counterexamples 
2.5 Proof by induction 


3 Indirect proof 
3.1 Proof by contradiction 
3.2 Proof by contraposition 


4 Equivalence relations 
4.1 What is an equivalence relation? 
4.2 Equivalence classes 


Summary 
Learning outcomes 


Solutions to exercises 




Unit A4 Real functions, graphs and conics 


Introduction 


1 Real functions and their graphs

1.1 Real functions 

1.2 Graphs of basic functions 

1.3 Translations and scalings of graphs 


2 Graph sketching 
2.1 Determining features of a graph 
2.2 Strategy for graph sketching 
3 New graphs from old 
3.1 Further graph-sketching techniques 
3.2 Hybrid functions 
4 Hyperbolic functions 
4.1 Properties of hyperbolic functions 
4.2 Graphs of hyperbolic functions 
5 Conics 
5.1 Conic sections 
5.2 Conics in standard position 
5.3 General equation of a conic 
5.4 Parametrising conics 
5.5 Summary: some standard parametrisations 
Summary 


Learning outcomes 


Solutions to exercises 


Acknowledgements 


Index 




Unit A1
Sets, functions and vectors 


Introduction 


Introduction to Book A 


M208 covers a wide range of pure mathematics, and each book apart from 
this one concentrates on one topic. This book is different, because it covers 
the main concepts that underlie the topics in the other books. 


In Unit A1 you will review some of the important foundations of pure
mathematics and the mathematical language used to describe them. You 
will start with the plane, and revise ideas relating to points, lines and 
circles. You will then study in detail the mathematical ideas of a set 
(mostly of numbers or of points in the plane), and a function, including 
functions of real numbers and functions of points in the plane. Finally, you 
will consider vectors in the plane and in three-dimensional space. 


In Unit A2 you will look at number systems and their properties. You will 
first consider real numbers, and sets of real numbers, such as the integers 
and the rational numbers, then study complex numbers, investigate their 
properties, and look at some functions of complex numbers. Finally, you 
will study modular arithmetic, which provides examples of finite number 
systems. 


In Unit A3 you will concentrate on mathematical language and 
communication. You will study the important subject of mathematical 
proof, including the use of different methods of proof, and how to disprove 
a statement by finding a counterexample. You will also consider errors in 
mathematical arguments including errors in deduction. Finally, you will 
study equivalence relations and the idea of a partition of a set. 


In Unit A4 you will concentrate on real functions, and on how to draw 
their graphs. You will review the graphs of various common functions, and 
consider a wide range of functions and their properties, including 
trigonometric and hyperbolic functions. Finally, you will consider curves 
that are not the graphs of real functions including conics (circles, 
parabolas, hyperbolas and ellipses) and see that they can be described in 
terms of a single parameter. 


Introduction 


In this unit you will look at some of the most fundamental mathematical 
concepts underlying pure mathematics. Many of these concepts should not 
be new to you, but working through this unit should ensure that you 
understand them to the level needed for M208. 


Sections 1 to 3 contain basic material that will be crucial throughout the 
module. It is vital that you become familiar and confident with the ideas 
and notation introduced in these sections. Section 4 revises concepts that 
will be used later in the module, in particular in Book C Linear algebra. 




Figure 1 The real line 


Figure 2 Cartesian coordinates


René Descartes 




1 Points, lines and distance 


In this section you will revise points, lines and distance in two- and 
three-dimensional space. 


1.1 The plane 


The set of all real numbers is denoted by R, and this set can be pictured as 
an infinitely long number line, often called the real line, as shown in 
Figure 1. Each real number a corresponds to a point on the line. 


In this subsection we consider the plane, or two-dimensional space. To 
allow us to specify the locations of points in the plane, we usually use a 
pair of perpendicular axes, known as Cartesian or rectangular axes. We 
usually label the axes x and y; we refer to their intersection point as the 
origin and sometimes label it O. Finally, we choose a unit of distance. 
The location of any point in the plane can be specified by using an 
ordered pair (a,b) of real numbers, known as Cartesian coordinates or 
just coordinates, that give the position of the point relative to the axes, 
as shown in Figure 2. (An ordered pair is a pair in which order matters;
for example, the ordered pair (2, 3) is different from the ordered pair
(3, 2).) We write A(a, b) to specify the point A with coordinates (a, b).

It is important to understand that the coordinates of a point depend on 
where the axes have been placed in the plane; if we had chosen the axes to 
be in a different position, then usually the coordinates of the point would 
be different. However, once we have chosen the position of the axes, we 
often do not bother to distinguish explicitly between a point and its 
representation using these coordinates: we simply write (a,b) to denote the 
point A. 


We use the notation $\mathbb{R}^2$ to denote the plane.


The adjective Cartesian comes from the surname of the French 
mathematician and philosopher René Descartes (1596-1650). He was 
the first person to show in print how algebra could be used to study 
geometry, in his 1637 publication La géométrie. Descartes’ procedure 
differed from the system of Cartesian coordinates that we use today. 
His axes were not necessarily at right angles, and could be chosen in 
relation to the circumstances of the problem rather than being given 
in advance. 


The plane, together with an origin O and a pair of x- and y-axes, is 
known as two-dimensional Euclidean space. 


Euclidean space is named after the Greek mathematician Euclid. 
Little is known for certain about Euclid but he is believed to have 
worked in Alexandria in around 300 BCE. 


Euclid’s Elements, a mathematical treatise of thirteen books which 
had its origins on papyrus rolls, has become one of the most frequently 
printed texts of all time. Although Elements covers both plane and 
solid Euclidean geometry, Euclid had no notion of axes or coordinates. 


Lines 
The equation of any straight line in $\mathbb{R}^2$, except a line parallel to the y-axis,
can be written in the form
$$y = mx + c,$$
where $m, c \in \mathbb{R}$.
In this equation:
• m is the gradient (or slope) of the line, given by
$$m = \frac{y_2 - y_1}{x_2 - x_1}, \tag{1}$$
where $(x_1, y_1)$ and $(x_2, y_2)$ are any two points on the line such that
$x_1 \ne x_2$
• c is the y-intercept of the line; that is, (0, c) is the point at which the
line crosses the y-axis, as illustrated in Figure 3(a).


The line with gradient m that crosses the y-axis at the origin has equation 
y = mx, since c = 0 in this case; see Figure 3(b). The horizontal line
(parallel to the x-axis) with y-intercept c has equation y = c, since the 
gradient m = 0 in this case; see Figure 3(c). 


The vertical line (parallel to the y-axis) with x-intercept a has
equation x = a; see Figure 3(d). The equation of such a line cannot be
written in the form y = mx + c because the gradient is undefined.


Figure 3 Lines in the plane


In all of the cases above, the equation of the line in the plane can be
rearranged to take the form
$$ax + by = c, \tag{2}$$
for some real numbers a, b and c, where a and b are not both zero. (Note
that the numbers a and c here are not the same as those called a and c in
Figure 3.)


Figure 4 Parallel and perpendicular lines


In fact, any line in $\mathbb{R}^2$ has an equation of the form (2) and, conversely, any
equation of the form (2) represents a line in $\mathbb{R}^2$.


Equation of a line


The general equation of a line in $\mathbb{R}^2$ is
$$ax + by = c,$$
where a, b and c are real numbers, and a and b are not both zero.


From formula (1) for the gradient of a line, we can see that the equation of
the line with gradient m that passes through the point $(x_1, y_1)$ is
$$y - y_1 = m(x - x_1).$$
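The relationship between two points on a line, its gradient and its general equation can be checked numerically. The following Python sketch is only an illustration: the function names `line_through` and `gradient` are hypothetical, and the sample points are chosen for the example. It cross-multiplies the gradient formula so that vertical lines are covered as well.

```python
from fractions import Fraction


def gradient(p, q):
    """Gradient of the line through p and q, or None if the line is vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None  # gradient undefined for a line parallel to the y-axis
    return Fraction(y2 - y1) / Fraction(x2 - x1)


def line_through(p, q):
    """Return (a, b, c) such that a*x + b*y = c is the line through p and q."""
    (x1, y1), (x2, y2) = p, q
    if (x1, y1) == (x2, y2):
        raise ValueError("the two points must be distinct")
    # From (y - y1)(x2 - x1) = (y2 - y1)(x - x1), rearranged to a*x + b*y = c.
    a = y2 - y1
    b = -(x2 - x1)
    c = a * x1 + b * y1
    return a, b, c


print(gradient((2, 3), (5, 9)))      # 2
print(line_through((2, 3), (5, 9)))  # (6, -3, 3), that is, 6x - 3y = 3
print(line_through((4, 0), (4, 7)))  # (7, 0, 28), the vertical line x = 4
```

The triple returned is one of many equivalent choices, since multiplying a, b and c by the same non-zero number gives the same line.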


Exercise A1


Determine the equation of the line with gradient $-3$ that passes through
the point $(2, -1)$.


Exercise A2


Determine the equation of the line through each of the following pairs of
points.


(a) (1, 1) and (3, 5) (b) (0, 0) and (0, 8) (c) (0, 0) and (4, 2)
(d) $(4, -1)$ and $(2, -1)$


Parallel and perpendicular lines 


Two distinct lines are parallel if they never meet, and perpendicular if 
they meet at right angles. 


Saying that two non-vertical lines are parallel is equivalent to saying that
they have the same gradient but different y-intercepts. For example, as
shown in Figure 4, the lines $y = -2x + 7$ and $y = -2x - 3$ are parallel
since they both have gradient $-2$ but their y-intercepts are 7 and $-3$,
respectively, whereas the lines $y = -2x + 7$ and $y = 2x - 3$ are not parallel
since their gradients $-2$ and 2 are not equal.


We can also use the gradients of a pair of non-vertical lines to check 
whether they are perpendicular, as follows. 


Gradients of perpendicular lines
Let $l_1$ and $l_2$ be lines with gradients $m_1$ and $m_2$, respectively.
• If $l_1$ and $l_2$ are perpendicular, then $m_1 m_2 = -1$.
• If $m_1 m_2 = -1$, then $l_1$ and $l_2$ are perpendicular.


To see that the first statement in the box is true, suppose that the lines $l_1$
and $l_2$ are perpendicular and that neither line is vertical. Let the gradients
of $l_1$ and $l_2$ be $m_1$ and $m_2$, respectively. Then one of the lines ($l_1$, say)
must slope up from left to right and the other ($l_2$, say) must slope down
from left to right, as shown in Figure 5.


Figure 5 Perpendicular lines


Let the lines intersect at P, and let Q be a point on $l_1$ to the right of P.
Suppose that Q is a units to the right of P and b units up from P, as
illustrated in Figure 5. Let R be the point on $l_2$ obtained by rotating PQ
anticlockwise through a right angle; then R is b units to the left of P and a
units up from P, as shown.


It follows that the gradient of $l_1$ is $m_1 = b/a$, and the gradient of $l_2$ is
$m_2 = -a/b$. Hence
$$m_1 m_2 = \frac{b}{a} \times \left(-\frac{a}{b}\right) = -1.$$


The proof of the second statement in the box above is not given here. 
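The gradient tests for parallel and perpendicular lines can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each line is given as a triple (a, b, c) representing ax + by = c with integer coefficients, the two lines are assumed to be distinct, and the names `gradient` and `relation` are hypothetical. Vertical lines, which have no gradient, are handled as a separate case.

```python
from fractions import Fraction


def gradient(a, b):
    """Gradient of the line a*x + b*y = c, or None if the line is vertical."""
    if b == 0:
        return None
    return Fraction(-a, b)  # rearranging gives y = (-a/b)x + c/b


def relation(line1, line2):
    """Classify two distinct lines as 'parallel', 'perpendicular' or 'neither'."""
    (a1, b1, _), (a2, b2, _) = line1, line2
    m1, m2 = gradient(a1, b1), gradient(a2, b2)
    if m1 is None or m2 is None:
        if m1 is None and m2 is None:
            return "parallel"  # two distinct vertical lines never meet
        other = m2 if m1 is None else m1
        # A vertical line is perpendicular only to a horizontal line.
        return "perpendicular" if other == 0 else "neither"
    if m1 == m2:
        return "parallel"
    if m1 * m2 == -1:
        return "perpendicular"
    return "neither"


# Sample lines: y = 3x + 1, x + 3y = 6 and 6x - 2y = 5.
print(relation((3, -1, -1), (1, 3, 6)))   # perpendicular
print(relation((3, -1, -1), (6, -2, 5)))  # parallel
print(relation((3, -1, -1), (1, 1, 0)))   # neither
```

Exact fractions are used rather than floating-point numbers so that the test $m_1 m_2 = -1$ is not affected by rounding.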


Worked Exercise A1 


Determine which of the following lines are parallel, and which are 
perpendicular to each other. 
l: y =—2r +4 Ig: 2r -38y-2=0 klz: y—2r=9 
l4: 2y+3r+5=0 b: a+ sy+2=0 Ig: 2y = 3r +7 




Exercise A3 
Determine which of the following lines are parallel, and which are 
perpendicular to each other. 


$l_1$: $y = -2x + 4$    $l_2$: $6x - 3y + 4 = 0$    $l_3$: $2y + x = 10$
$l_4$: $6y - 3x + 5 = 0$    $l_5$: $x - 2y + 2 = 0$    $l_6$: $2y + 4x + 7 = 0$


Distance between two points in the plane 


Next, we find the formula for the distance between any two points in the 
plane. 


We use the idea of the modulus of a real number k, written |k| and
defined by
$$|k| = \begin{cases} k, & \text{if } k \ge 0, \\ -k, & \text{if } k < 0. \end{cases}$$


(The modulus of k, usually read as ‘mod k’, is sometimes called the
absolute value or magnitude of k.)


Suppose that P(x1,y1) and Q(x2, y2) are two points in the plane, as shown 
in Figure 6. We can construct a right-angled triangle PNQ as shown: the 
line PN is parallel to the x-axis, the line QN is parallel to the y-axis, the
angle PNQ is a right angle, and PQ is the hypotenuse of the triangle. In 
Figure 6, P and Q are drawn in the first quadrant and with PQ sloping up 
from left to right, but the formula holds wherever the points are in the 
plane. 


Figure 6 Distance between P and Q in the plane


The length of PN is $|x_2 - x_1|$ and the length of QN is $|y_2 - y_1|$. It follows
from Pythagoras’ Theorem that
$$PQ^2 = PN^2 + QN^2,$$
and since $|k|^2 = k^2$ for any real number k, we have
$$PQ = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}.$$


Distance formula for $\mathbb{R}^2$


The distance between the two points $(x_1, y_1)$ and $(x_2, y_2)$ in the plane
is
$$\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}.$$


For example, it follows from the formula above that the distance between
the points $(1, 2)$ and $(3, -4)$ is
$$\sqrt{(3 - 1)^2 + (-4 - 2)^2} = \sqrt{2^2 + (-6)^2} = \sqrt{40} = \sqrt{4 \times 10} = \sqrt{4}\,\sqrt{10} = 2\sqrt{10}.$$
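The worked example above can be checked numerically. The short Python sketch below is an illustration only; the function name `distance` is hypothetical, and the result is a floating-point approximation rather than the exact value $2\sqrt{10}$.

```python
import math


def distance(p, q):
    """Distance between points p = (x1, y1) and q = (x2, y2) in the plane."""
    (x1, y1), (x2, y2) = p, q
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)


d = distance((1, 2), (3, -4))
# Compare with the exact answer 2 * sqrt(10), allowing for rounding error.
print(math.isclose(d, 2 * math.sqrt(10)))  # True
# Swapping the two points does not change the distance.
print(math.isclose(d, distance((3, -4), (1, 2))))  # True
```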


Exercise A4 
Find the distances between the following pairs of points in the plane. 
(a) (0,0) and (5,0) (b) (0,0) and (3,4) (c) (1,2) and (5,1) 
(d) $(3, -8)$ and $(-1, 4)$


Circles 


A circle in $\mathbb{R}^2$, as illustrated in Figure 7, is the set of points P(x, y) that
lie at a fixed distance r, called the radius, from a fixed point C(a, b),
called the centre of the circle.


Figure 7 A circle with radius r and centre (a, b)


By the distance formula, every point (x, y) on the circle with centre (a, b)
and radius r satisfies the equation
$$\sqrt{(x - a)^2 + (y - b)^2} = r.$$


Squaring this equation to remove the square root gives the following.


Equation of a circle


The equation of the circle in $\mathbb{R}^2$ with centre (a, b) and radius r is
$$(x - a)^2 + (y - b)^2 = r^2.$$




In this unit we will just work with equations of circles in this form, 
without multiplying out the brackets. In Unit A4 Real functions, graphs 
and conics, you will see how multiplying out the brackets leads to other 
forms for the equations of circles. 
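As an informal check of the circle equation, the Python sketch below tests whether a point satisfies $(x - a)^2 + (y - b)^2 = r^2$. The function name `on_circle` and the sample circle are assumptions made for illustration. The radius is passed as its square so that the comparison uses exact integer arithmetic, which matters for radii such as $\sqrt{3}$.

```python
def on_circle(point, centre, radius_squared):
    """Return True if point lies on the circle with the given centre and r^2."""
    x, y = point
    a, b = centre
    return (x - a) ** 2 + (y - b) ** 2 == radius_squared


# Sample circle with centre (2, -3) and radius 5, so r^2 = 25.
print(on_circle((5, 1), (2, -3), 25))  # True, since 3^2 + 4^2 = 25
print(on_circle((2, 2), (2, -3), 25))  # True, since 0^2 + 5^2 = 25
print(on_circle((0, 0), (2, -3), 25))  # False, since 2^2 + 3^2 = 13
```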


Worked Exercise A2 


Find the equation of the circle with centre $(-1, 2)$ and radius $\sqrt{3}$.


Exercise A5 


Determine the equation of each of the following circles, given the centre 
and radius. 


(a) Centre the origin, radius 4.
(b) Centre $(-1, 0)$, radius $\sqrt{2}$.
(c) Centre $(3, -4)$, radius 2.


1.2 Three-dimensional space 
We now look briefly at three-dimensional space. 


We define a coordinate system in three-dimensional space using three 
mutually perpendicular axes. The word mutually here means that the 
condition holds for any pair, so mutually perpendicular means that any 
two of the axes are perpendicular. 


First, we choose a point O as the origin, and then we choose an x-axis and 
a y-axis at right angles to each other. Next, we draw a third line through 
the origin, perpendicular both to the x-axis and to the y-axis; this line is 
called the z-axis. We choose the positive direction of the z-axis to be such 
that the x-, y- and z-axes form a so-called right-handed system of axes. 
This means that if you hold the thumb and first and second fingers of your 
right hand at right angles to each other, and label them x, y and z, in that 
order, then you can turn your hand in such a way that your fingers point 
in the positive directions of the corresponding axes, as shown in Figure 8. 


Figure 8 A right-handed system of coordinate axes for $\mathbb{R}^3$


Finally, we choose a unit of distance. 


We represent each point in three-dimensional space by an ordered triple 
(a,b,c) of real numbers. The point with coordinates (a, b,c) is reached 
from the origin by moving a distance a in the direction of the x-axis, a 
distance b in the direction of the y-axis, and a distance c in the direction of 
the z-axis, as illustrated in Figure 9(a). 


For instance, the point with coordinates $(-3, -2, 4)$ is shown in
Figure 9(b).


Figure 9 Three-dimensional Cartesian coordinates 


In Figure 9, the plane containing the x-axis and the y-axis is shaded.
Usually we think of this plane as being horizontal, and the z-axis as being 
vertical. 


We use the notation $\mathbb{R}^3$ to denote three-dimensional space.


Exercise A6


Sketch the x-, y- and z-axes and the points with coordinates (0, 1, 2) and
(1,91). 




As with $\mathbb{R}^2$, once we have chosen the position of the axes, we often do not
bother to distinguish explicitly between a point and its representation
using these coordinates; we simply write (a, b, c) to denote the point in $\mathbb{R}^3$
represented by this triple.

Three-dimensional space, together with an origin and a set of x-, y- and 
z-axes, is known as three-dimensional Euclidean space. 


Distance between points in $\mathbb{R}^3$


You saw in Subsection 1.1 that the distance between two points $(x_1, y_1)$
and $(x_2, y_2)$ in the plane is given by
$$\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}.$$


We can establish a similar formula for the distance between two points
in $\mathbb{R}^3$, as follows.


Let $P(x_1, y_1, z_1)$ and $Q(x_2, y_2, z_2)$ be two points in $\mathbb{R}^3$. Let M be the point
$(x_2, y_2, z_1)$; then M lies in the same horizontal plane as P, and MQ is
parallel to the z-axis. Next, let N be the point $(x_1, y_2, z_1)$; then N also lies
in the same horizontal plane as P, and MN and NP are parallel to the x-
and y-axes, respectively.


The triangles PQM and PMN are both right-angled triangles, with right 
angles at M and N, respectively, as shown in Figure 10. 


Figure 10 Distance between P and Q in $\mathbb{R}^3$


The length of PN is $|y_2 - y_1|$ and the length of NM is $|x_2 - x_1|$. It follows
from Pythagoras’ Theorem that
$$PM^2 = NM^2 + PN^2,$$
so
$$PM^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2.$$
Using Pythagoras’ Theorem again gives
$$PQ^2 = PM^2 + MQ^2,$$
and since the length of MQ is $|z_2 - z_1|$ we obtain
$$PQ^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2,$$
that is,
$$PQ = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}.$$


Distance formula for $\mathbb{R}^3$
The distance between the two points $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ in $\mathbb{R}^3$
is
$$\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}.$$


For example, it follows from this formula that the distance between the
points $(1, 2, 3)$ and $(4, -2, 15)$ is
$$\sqrt{(4 - 1)^2 + (-2 - 2)^2 + (15 - 3)^2} = \sqrt{169} = 13.$$
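The two-step use of Pythagoras’ Theorem in the derivation above can be mirrored in code. In this Python sketch the function name `distance_3d` is hypothetical; it first finds the horizontal length PM and then combines it with the vertical length MQ. The standard library function `math.dist` (available from Python 3.8) is used as an independent check.

```python
import math


def distance_3d(p, q):
    """Distance between p = (x1, y1, z1) and q = (x2, y2, z2)."""
    (x1, y1, z1), (x2, y2, z2) = p, q
    pm = math.hypot(x2 - x1, y2 - y1)  # length PM in the horizontal plane
    return math.hypot(pm, z2 - z1)     # combine PM with the vertical length MQ


d = distance_3d((1, 2, 3), (4, -2, 15))
print(math.isclose(d, 13))                                 # True
print(math.isclose(d, math.dist((1, 2, 3), (4, -2, 15))))  # True
```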


Exercise A7 


Find the distances between the following pairs of points in $\mathbb{R}^3$.
(a) $(1, 1, 1)$ and $(4, 1, -3)$ (b) $(1, 2, 3)$ and $(3, 0, 3)$


We will return to the topic of three-dimensional space in Section 4, where
we will consider vectors in $\mathbb{R}^2$ as well as in $\mathbb{R}^3$, and find the general
equation of a plane in $\mathbb{R}^3$.


2 Sets 


In this section you will revise the notion of sets, learn new notation for 
describing sets, and practise working with sets and set notation. These 
skills will be crucial in the rest of the module. 


2.1 What is a set? 


In mathematics we frequently work with collections of objects of various 
kinds. We may, for example, consider the following: 


• solutions of a quadratic equation
• points on a circle
• vertices of a triangle
• points on a plane in $\mathbb{R}^3$
• even numbers less than 100
• students taking a particular examination.


The concept of a set allows us to work with such collections systematically. 


You can think of a set as a collection of objects, such as numbers, points, 
functions, or even a collection of other sets. Each object in a set is an 
element or member of the set, and the elements belong to the set, or are 
in the set. 


Figure 11 A Venn diagram of the set S


There is no restriction on the types of object that may appear in a set, 
provided that the set is specified in a way that enables us to decide, in 
principle, whether a given object is in the set. 


There are many ways of making such a specification. For example, we can 
define S to be the set of numbers in the list 
4, 9, 3, 2. 


This enables us to decide that the number 2 (say) is in S, but that the 
number 1 (say) is not in S. We can illustrate this set by a diagram, as in 
Figure 11, where the symbol S is not a member of the set but a label for 
it. (Similar labels will appear in other diagrams.) Such a diagram is called 
a Venn diagram, after the nineteenth-century Cambridge mathematician 
John Venn. 


We can also define a set by describing its elements; for example, 
let E be the set of all even integers. 


This description enables us to determine whether a given object is in E by 
deciding whether it is an even integer; for example, 6 is in E, but 5 is not.


Some sets are used so often that special symbols are reserved for them. 
Recall that a real number is a number with a decimal expansion (possibly
infinite), for example, 1.1 or $\pi = 3.14\ldots$, and a rational number is a real
number that can be expressed as a fraction, for example, $14/5$ or $-3/4$.
You will revise these sets more thoroughly in Unit A2 Number systems. 
We use the following notation, some of which you met in Section 1. 


$\mathbb{R}$ denotes the set of real numbers.
$\mathbb{R}^*$ denotes the set of non-zero real numbers.
$\mathbb{Q}$ denotes the set of rational numbers.
$\mathbb{Z}$ denotes the set of integers $\ldots, -2, -1, 0, 1, 2, \ldots$.
$\mathbb{N}$ denotes the set of natural numbers $1, 2, 3, \ldots$.


A finite set is a set that has a finite number of elements; that is, the 
number of elements is some natural number, or 0. Any set that is not a 
finite set is an infinite set. 


We use the symbol $\in$ to indicate membership of a set; for example, we
indicate that 7 is a member of $\mathbb{N}$ by writing

$7 \in \mathbb{N}$. (This is usually read as ‘7 belongs to $\mathbb{N}$’ or ‘7 is in $\mathbb{N}$’.)

We indicate that $-9$ is not a member of $\mathbb{N}$ by writing

$-9 \notin \mathbb{N}$. (‘$-9$ does not belong to $\mathbb{N}$’ or ‘$-9$ is not in $\mathbb{N}$’.)


We also use the symbol $\in$ when we wish to introduce a symbol that stands
for an arbitrary (that is, general, unspecified) element of a set. For
example, to indicate that x is a real variable, that is, an arbitrary
member of the set $\mathbb{R}$, we write

let $x \in \mathbb{R}$.

We often write $x_1, x_2 \in S$ as shorthand to combine $x_1 \in S$ and $x_2 \in S$.


Exercise A8 


Which of the following statements are true? 
(a) -3E€Z b) 5¢N (c) 13¢Q (da) 1,3€Q 
(e) -reER (f) EN (g) O01ER* (h) V2ER 


2.2 Set notation 


We now look at some formal ways of specifying a set. 


We can specify a set with a small number of elements by listing these 
elements between a pair of braces (curly brackets). For example, we can 
specify the set A consisting of the first five natural numbers, illustrated in 
Figure 12, by 

A = {1, 2, 3, 4, 5}.

Figure 12 The set A

The membership of a set is not affected by the order in which its elements
are listed, so we can specify this set A equally well by

A = {5, 2, 1, 4, 3}.


Similarly, we can specify the set B of vertices of the square shown in
Figure 13 by

B = {(0, 0), (1, 0), (1, 1), (0, 1)}.

Figure 13 The set B

We can even specify a set C, illustrated in Figure 14, whose elements are
the three sets {1, 3, 5}, {9, 4} and {2} by

C = {{1, 3, 5}, {9, 4}, {2}}.


A set with only one element, such as the set {2}, is called a singleton or a
singleton set. (Do not confuse the set {2}, which contains the number 2,
with the number 2 itself.)

Figure 14 The set C


Exercise A9


Which of the following statements are true?

(a) $1 \in \{4, 3, 1, 7\}$
(b) $\{-9\} \in \{\{6, 1, 2\}, \{8, 7, 9, 5\}, \{9\}, \{5, 4\}\}$
(c) $\{9\} \in \{5, 6, 7, 8, 9\}$
(d) $(0, 1) \in \{(1, 0), (1, 4), (2, 4)\}$
(e) $1, 0 \in \{(1, 0), (1, 4), (2, 4)\}$
(f) $\{1, 0\} \in \{\{0, 1\}, \{1, 4\}, \{2, 4\}\}$


It does not matter if we specify a set element more than once within set 
brackets. For example, 


{1,2,3,3} and {1,2,3} 
describe the same set. However, we usually try to avoid specifying an 
element more than once. 
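Python’s built-in `set` type follows the same conventions for listed sets, so the points above can be tried out directly. The snippet is an informal illustration; the variable names simply echo the sets A and C described earlier. One assumption specific to Python is that a set whose elements are themselves sets must be built from `frozenset` objects, because ordinary Python sets cannot be elements of other sets.

```python
A = {1, 2, 3, 4, 5}

# The order in which elements are listed does not matter.
print(A == {5, 2, 1, 4, 3})  # True

# Listing an element more than once does not change the set.
print({1, 2, 3, 3} == {1, 2, 3})  # True
print(len({1, 2, 3, 3}))          # 3

# A set whose elements are sets, like C = {{1, 3, 5}, {9, 4}, {2}}.
C = {frozenset({1, 3, 5}), frozenset({9, 4}), frozenset({2})}
print(frozenset({2}) in C)  # True: the singleton {2} is an element of C
print(2 in C)               # False: the number 2 is not an element of C
```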


For a set with a large number of elements, it is not practical to list all the 
elements, so we sometimes use three dots (called an ellipsis) to indicate 
that a particular pattern of membership continues. For example, we can 
specify the set consisting of the first 100 natural numbers by writing 


{1,2,3,... , 100}. 


The use of an ellipsis can be extended to certain infinite sets. For example, 
we can specify the set of all natural numbers by writing 


{1, 2, 3, ...}.


One disadvantage of this notation is that the pattern indicated by the 
ellipsis may be ambiguous. For example, it is not clear whether 


{3,5,7,...} 


denotes the set of odd prime numbers or the set of odd natural numbers 
greater than 1. For this reason, this notation can be used only when the 
pattern of membership is obvious, or where an additional clarifying 
explanation is given. 


An alternative way of specifying a set is to use variables to build up 
objects of the required type, and then write down the condition(s) that the 
variables must satisfy. For example, consider the set of all real numbers x 
such that x > 3. Using set notation, we write this as 


$$\{x \in \mathbb{R} : x > 3\},$$


which is read as shown in Figure 15.


Figure 15 How to ‘read’ set notation


A set can often be described in several different ways using such set 
notation. In particular, we can use a letter other than x to denote an 
arbitrary (general) element of a set; for example, the set above can also be 
written as 


$$\{r \in \mathbb{R} : r > 3\}.$$


If it is necessary to include more than one condition after the colon, then
we write either a comma or the word ‘and’ between the conditions. So the
set of real numbers greater than 0, and less than or equal to 1, can be
written as
$$\{x \in \mathbb{R} : x > 0,\ x \le 1\} \quad \text{or} \quad \{x \in \mathbb{R} : x > 0 \text{ and } x \le 1\},$$
although usually we combine the inequalities and write
$$\{x \in \mathbb{R} : 0 < x \le 1\}.$$


Sometimes it is convenient to specify a set by writing an expression in one 
or more variables before the colon, and the conditions on the variables 
after the colon. For example, the set of even integers less than 100 may be 
specified by 


$$\{2k : k \in \mathbb{Z} \text{ and } k < 50\}.$$


Just as when listing the elements of a set, it does not matter when using
set notation if a set element is specified more than once. For example,
$$\{\sin x : x \in \mathbb{R}\}$$
specifies the same set as
$$\{\sin x : 0 \le x \le 2\pi\}.$$
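Set notation of this kind has a close analogue in Python’s set comprehensions, where the expression comes first and the conditions follow. The sketch below is only an approximation of the mathematics: a computer cannot list an infinite set, so k is restricted to the window from $-50$ to 49, which is an assumption made for the example. The function `is_even_below_100` is a hypothetical membership test that covers the whole infinite set.

```python
# Finite window onto {2k : k in Z and k < 50}; only k = -50, ..., 49 is listed.
evens = {2 * k for k in range(-50, 50)}
print(98 in evens)   # True
print(100 in evens)  # False: it would need k = 50
print(7 in evens)    # False: 7 is odd


def is_even_below_100(n):
    """Membership test for the full set {2k : k in Z and k < 50}."""
    return isinstance(n, int) and n % 2 == 0 and n < 100


print(is_even_below_100(-1000))  # True, although it lies outside the window

# An element specified more than once still appears only once:
# k and -k give the same square.
squares = {k * k for k in range(-3, 4)}
print(squares == {0, 1, 4, 9})  # True
```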


Exercise A10 


Which of the following statements are true? 

(a) 3e{rER:1>3} (b) TE {3kK4+1:k eZ} 

(c) —fe{xeZ:2<5} (da) 8€ {2 :xER, 0<2< 2} 

(e) $9 \in \{n \in \mathbb{Z} : n = k^2 \text{ for some } k \in \mathbb{Z}\}$ (f) $6 \in \{m(m - 1) : m \in \mathbb{N}\}$
(g) $4 \in \{r : r \text{ is an even integer},\ 0 < r < 4\}$


Notice that the next worked exercise contains lines of blue text, marked
with icons. You will see similar text in some of the worked
exercises and proofs throughout this module. This text tells you what 
someone doing the mathematics might be thinking, but would not write 
down; or what a lecturer might say to explain the thinking behind the 
mathematics, but would not write on the board. It should help you 
understand how you might approach a similar exercise yourself. 




Worked Exercise A3 


Use set notation to specify each of the following. 


(a) The set of all natural numbers greater than 50. 
(b) The set of all odd integers. 


The choice of the variables is arbitrary in these sets, but k for an integer 
and n for a natural number are conventional. 


Exercise A11 


Use set notation to specify each of the following. 

(a) The set of integers greater than $-2$ and less than 1000.
(b) The set of positive rational numbers with square greater than 2.
(c) The set of even natural numbers.
(d) The set of integer powers of 2.


Set notation is useful when we wish to refer to the set of solutions of one or 
more equations (called the solution set). For example, the real solutions 
of the equation z? = 1 form the set 


{z €R: 2? =1} = {-1,1}. 


The solution set of an equation depends on the set of values from which 
the solutions are taken. For example, the solution set of the equation 


(a — 1)(22 —1) =0 

is 
{z ER: (—1)(2x- 1) = 0} = {1,5 

if we are interested in real solutions. However, the solution set is 
{xz €Z:(x—1)(22 —1) = 0} = {1} 


if we are interested only in integer solutions. In this unit we assume that 
solutions are taken from R unless otherwise stated. 
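
The dependence of a solution set on the set from which solutions are taken can be sketched in Python. This is only an illustration: the finite sets `integers` and `halves` below are hypothetical stand-ins for Z and for a small sample of rational numbers, since a program cannot enumerate all of R.

```python
from fractions import Fraction


def solution_set(candidates):
    """Return {x in candidates : (x - 1)(2x - 1) = 0}."""
    return {x for x in candidates if (x - 1) * (2 * x - 1) == 0}


# Hypothetical finite stand-ins for the sets the solutions are taken from:
# the integers from -5 to 5, and the multiples of 1/2 from -5 to 5.
integers = set(range(-5, 6))
halves = {Fraction(n, 2) for n in range(-10, 11)}

integer_solutions = solution_set(integers)  # only x = 1 is an integer solution
half_solutions = solution_set(halves)       # x = 1 and x = 1/2 both appear
```

Exact fractions are used rather than floating-point numbers so that the test for equality with 0 is reliable.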


Sometimes an equation has no real solutions, so its solution set has no 
elements. 


The set with no elements arises frequently in mathematics, so it is given a 
special name and notation. It is called the empty set and is denoted by
the symbol ∅. Thus, for example,


{x ∈ R : x² = −1} = ∅.


The symbol for the empty set, ∅, was introduced in 1939 by the
French mathematician André Weil (1906-1998), who took the symbol
from the Norwegian alphabet.


André Weil 


2.3 Intervals 


You saw in Subsection 1.1 that the set of real numbers R can be pictured 
as a number line, called the real line. Many sets involve ranges of real 
numbers extending along the real line from one number a to another 
number b. Each of the endpoints a and b may be either included or 
excluded. Such sets are called intervals of the real line, and they occur so 
frequently that we use special notation for them. For example: 


• the interval given by −2 < x < 5, in which both endpoints are excluded,
is denoted by (−2, 5) and is an example of an open interval


• the interval given by −2 ≤ x ≤ 5, in which both endpoints are included,
is denoted by [−2, 5] and is an example of a closed interval


• the intervals given by −2 < x ≤ 5 and −2 ≤ x < 5, in which one
endpoint is included and the other is excluded, are denoted by (−2, 5]
and [−2, 5), respectively, and are examples of half-open (or half-closed)
intervals.


In some texts, a reversed square bracket is used instead of a round bracket
to indicate an excluded endpoint; for example ]−2, 5[ is used instead of
(−2, 5) for an open interval.


We use the symbol ∞ (infinity) when an interval extends indefinitely far to
the right on the real line, and the symbol −∞ when an interval extends
indefinitely far to the left. For example:


• the set of all real numbers greater than −3 is denoted by (−3, ∞)
• the set of all real numbers less than or equal to 4 is denoted by (−∞, 4].


The symbol ∞ does not denote a real number: instead, it simply means
that the interval continues indefinitely. We always use round brackets with
∞ and −∞.


The notation for intervals is summarised in the box below. 




Interval notation
Intervals are denoted as follows.


Open intervals
(a, b)      a < x < b
(a, ∞)      x > a
(−∞, b)     x < b
(−∞, ∞)     R


Closed intervals
[a, b]      a ≤ x ≤ b
[a, ∞)      x ≥ a
(−∞, b]     x ≤ b
(−∞, ∞)     R


Half-open (or half-closed) intervals
[a, b)      a ≤ x < b
(a, b]      a < x ≤ b


Remarks 


1. In the box above, a hollow dot ○ indicates that an endpoint is excluded,
and a solid dot ● indicates that an endpoint is included.


2. A singleton set {a}, containing a single number a, is a closed interval
whose endpoints are equal.


3. An interval such as [a, ∞) is regarded as closed, rather than half-open
(or half-closed), because it contains all the real numbers greater than or
equal to a. However, the interval R = (−∞, ∞) is considered to be both
open and closed.


4. We also use the notation (a,b) to denote a point in the plane, but in 
most cases it should be obvious whether a point or an interval is 
intended. 
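
As an informal illustration of interval notation, the following Python sketch models an interval by its two endpoints together with two flags recording whether each endpoint is included. The class name `Interval` and its field names are hypothetical, chosen only for this example; `math.inf` plays the role of ∞ and is never treated as a member, in line with the remark that ∞ is not a real number.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """An interval of the real line with endpoints a and b (a <= b)."""
    a: float
    b: float
    includes_a: bool
    includes_b: bool

    def __contains__(self, x):
        # Infinity is not a real number, so it belongs to no interval.
        if math.isinf(x):
            return False
        above = x >= self.a if self.includes_a else x > self.a
        below = x <= self.b if self.includes_b else x < self.b
        return above and below


open_interval = Interval(-2, 5, False, False)    # (-2, 5)
closed_interval = Interval(-2, 5, True, True)    # [-2, 5]
ray = Interval(-3, math.inf, False, False)       # (-3, ∞)
```

With these definitions, the expression `5 in closed_interval` is true, whereas `5 in open_interval` is false.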


Exercise A12 


Which of the following statements are true? 
(a) 1 ∈ (1, 5) (b) 1 ∈ (−1, 1] (c) ∞ ∈ (0, ∞) (d) 0 ∈ R*


(e) If x ∈ R*, then x ∈ (0, ∞).


Exercise A13 


Use interval notation to specify the following intervals. 


(a) [Figure: an interval marked on the real line; the axis labels in the source read 4, 0 and 2]


(b) The set of real numbers x such that —6.5 < x < 21. 
(c) {x ∈ R : x > −273}.



2.4 Plane sets 


In Subsection 1.1 you met the plane R², and saw that each point in the
plane can be represented as an ordered pair (x, y) with respect to a chosen
pair of axes. A set of points in R² is called a plane set or a plane figure.
The lines and circles that you met in Subsection 1.1 are simple examples of 
plane sets. 


Lines as plane sets


Consider a straight line l₁ with gradient m and y-intercept c, as illustrated
in Figure 16. This line is the set of all points (x, y) in the plane such that
y = mx + c. Using set notation, we write this as


l₁ = {(x, y) ∈ R² : y = mx + c}.


(We often refer to ‘the line y = mx + c’ as a shorthand way of specifying
this set.)


Figure 16 The line l₁


For a line l₂ parallel to the y-axis with x-intercept a, as illustrated in
Figure 17, we write


l₂ = {(x, y) ∈ R² : x = a}.


Figure 17 The vertical line l₂


An alternative way of specifying a line is to write an expression for one or
both of the coordinates. For example, an alternative way of specifying the
line l₁ with equation y = mx + c is


l₁ = {(x, mx + c) : x ∈ R}.


It does not matter what variable we use to specify the line. For example,
we can also write


l₁ = {(t, mt + c) : t ∈ R}.


Exercise A14 
(a) Use set notation to specify the line l with gradient 2 that passes 
through the point (0,5). 
(b) Sketch the line l = {(x, y) ∈ R² : y = 1 − x}.
(c) Sketch the line l = { (x, x£) : x € R}. 


Circles as plane sets 


Consider a circle C with centre (a, b) and radius r, as illustrated in
Figure 18. This circle is the set of all points (x, y) in the plane such that
(x − a)² + (y − b)² = r², so, in set notation, it can be written as


C = {(x, y) ∈ R² : (x − a)² + (y − b)² = r²}.


Figure 18 A circle


The unit circle U is defined to be the circle centred at the origin with
radius 1, so it is the set of points (x, y) in the plane whose distance from
the origin (0, 0) is 1 (see Figure 19). In set notation, the unit circle can be
written as


U = {(x, y) ∈ R² : x² + y² = 1}.


Figure 19 The unit circle U
Figure 20 The plane split into three parts by l
Figure 21 A point P in H₁
Figure 22 The plane split into three parts by the unit circle U


Exercise A15 


(a) Use set notation to specify the circle C of radius 3 centred at (1, −4).
(b) Sketch the circle C = {(x, y) ∈ R² : (x − 1)² + (y − 3)² = 4}.


Half-planes, discs and other plane sets 


Consider the line
l = {(x, y) ∈ R² : y = 1 − x}.


This line splits R² into three separate parts, as shown in Figure 20: the
line l itself, the set H₁ of points lying above the line, and the set H₂ of
points lying below the line.


For any point P = (x, y) in H₁, the point Q = (x, 1 − x) lies on the line l,
directly below P, as illustrated in Figure 21, so y > 1 − x. Similarly, each
point (x, y) in H₂ satisfies y < 1 − x. Thus


H₁ = {(x, y) ∈ R² : y > 1 − x}
and
H₂ = {(x, y) ∈ R² : y < 1 − x}.


The set of points on one side of a line, possibly together with all the points
on the line itself, is known as a half-plane. A half-plane that does not
include the points on the line can be specified using set notation as in the
examples H₁ and H₂ above. The corresponding half-plane that includes
the points on the line can be specified by changing the symbol > to ≥, or
the symbol < to ≤.

When we sketch a plane set that excludes a boundary line, as for the set
H₁ in Figure 21, we draw the boundary as a broken line; if the plane set
includes a boundary line, then we draw the boundary as a solid line.

We can treat other plane sets in a similar way. For example, consider the
unit circle


U = {(x, y) ∈ R² : x² + y² = 1}.


This circle splits R² into three separate parts, as illustrated in Figure 22:
the circle U itself, the set D₁ of points lying inside the circle and the
set D₂ of points lying outside the circle.


The condition for a point (x, y) to lie inside U is that the distance of the
point from the origin is less than 1. It follows that the square of the
distance of the point (x, y) from the origin is also less than 1, so


D₁ = {(x, y) ∈ R² : x² + y² < 1}.


Similarly,
D₂ = {(x, y) ∈ R² : x² + y² > 1}.


The set of points inside a circle, possibly together with all the points on
the circle, is known as a disc. Figure 23 shows the disc D₁ with the broken
line indicating that the points on the circle are not included in the set.


If we wish to specify the disc consisting of the unit circle together with the
points inside it, we replace the inequality < by ≤ in the set notation
specification of D₁ given above, and draw the boundary as a solid line.


As another example, consider the set of points lying inside the square with 
vertices (0,0), (1,0), (1,1) and (0,1), shown in Figure 24. This set can be 
written as 


{(x, y) ∈ R² : 0 < x < 1, 0 < y < 1}.


The square boundary is excluded from this set, and we indicate this by 
drawing the boundary lines as broken lines and the vertices as hollow dots, 
as in Figure 24. 


If we wish our set to include the square boundary, we replace each symbol
< by ≤, and we indicate this in a sketch by drawing the boundary lines as
solid lines and the four vertices as solid dots.
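
A plane set specified by a condition on (x, y) corresponds naturally to a membership test. The short Python sketch below writes such tests for the open half-plane above the line y = 1 − x, the open and closed unit discs, and the open unit square; the function names are hypothetical and chosen only for this illustration. The only difference between an open set and its closed counterpart is whether the comparison is strict.

```python
def in_H1(x, y):
    """Open half-plane above the line y = 1 - x (boundary excluded)."""
    return y > 1 - x


def in_D1(x, y):
    """Open unit disc: points strictly inside the unit circle."""
    return x**2 + y**2 < 1


def in_closed_disc(x, y):
    """Closed unit disc: the unit circle together with the points inside it."""
    return x**2 + y**2 <= 1


def in_open_square(x, y):
    """Points inside the square with vertices (0,0), (1,0), (1,1), (0,1)."""
    return 0 < x < 1 and 0 < y < 1
```

For example, the point (1, 0) lies on the unit circle, so it passes the test for the closed disc but not the test for the open disc.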


These conventions for drawing plane sets are consistent with those you met 
earlier for intervals. They are summarised below. 


Convention for drawing sets in R or R²
In a diagram of a subset of R or R²:


• included and excluded points are drawn as solid and hollow dots,
respectively


• included and excluded boundaries are drawn as solid and broken
lines, respectively.


Exercise A16 


Sketch each of the following plane sets. 
(a) {(x, y) ∈ R² : x < 1}
(b) {(x, y) ∈ R² : y ≤ 2 − 2x}
(c) {(x, y) ∈ R² : (x − 1)² + (y − 2)² < 4}
(d) {(x, y) ∈ R² : x² + (y + 3)² > 1}


Exercise A17 


Use set notation to specify the set of points inside the square with vertices 
(0,1), (2,1), (2,3), (0,3), together with the boundary, and sketch this set. 


Figure 23 The disc D₁
Figure 24 The points inside a square
Figure 25 A subset A of a set B


2.5 Set equality and subsets 


Consider the sets A = {1, −1} and B = {x ∈ R : x² − 1 = 0}. Although
these sets are written in different ways, each set contains exactly the same
elements, 1 and −1. We say that these sets are equal.


Definition
Two sets A and B are equal if they have exactly the same elements;
we write A = B.


When two sets each contain a small number of elements, we can usually 
check whether these elements are the same, and hence decide whether the 
sets are equal. 


Exercise A18 


Decide whether each of the following is a pair of equal sets. 
(a) A = {2, −3} and B = {x ∈ R : x² + x − 6 = 0}.


(b) A = {k ∈ Z : k is odd and 0 < k < 8} and
B = {2n + 1 : n ∈ N and n² < 25}.


If two sets each contain more than a small number of elements, it is less 
easy to check whether they are equal. You will meet a method for dealing 
with cases like this shortly, but first we need the following idea. 


Consider the sets A = {7,2,5} and B = {2,3,5,7,11}. These sets are 
illustrated in the Venn diagram in Figure 25. Each element of A is also an 
element of B. We say that A is a subset of B. 


Definition
A set A is a subset of a set B if each element of A is also an element
of B. We also say that A is contained in B, and we write A ⊆ B.


Do not confuse the symbol ⊆ with the symbol ∈. For example, we write
{1} ⊆ {1, 2, 3} and 1 ∈ {1, 2, 3},
because {1} is a subset of {1, 2, 3} and 1 is an element of {1, 2, 3}.


We sometimes indicate that a set A is a subset of a set B by reversing the
symbol ⊆ and writing B ⊇ A, which we read as ‘B contains A’.


To indicate that A is not a subset of B, we write A ⊈ B. We may also
write this as B ⊉ A, which we read as ‘B does not contain A’.


The next box gives two simple but important facts about subsets. 


Subsets of every set

For every set B:

• B is a subset of itself, that is, B ⊆ B

• the empty set ∅ is a subset of B, that is, ∅ ⊆ B.


The first result in the box follows immediately from the definition of a 
subset, given earlier. The second result in the box also follows from the 
definition, since any set B contains every element of the empty set, for the 
simple reason that the empty set has no elements. 


When we wish to determine whether a set A is a subset of a set B, the 
method we use depends on the way in which the two sets are defined. If A 
has a small number of elements, then we can check individually whether 
each element of A is an element of B. Otherwise, we determine 
algebraically whether an arbitrary element of A fulfils the membership 
criteria for B, as illustrated in Worked Exercise A4 below. 


To show that a set A is not a subset of a set B, we need to find at least 
one element of A that does not belong to B. 
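
For a finite set A, the check described above can be sketched in Python. The helper `is_subset` is a hypothetical name for this illustration: it takes the elements of A together with a membership test for B, and returns either confirmation that A is a subset of B or an element of A that does not belong to B.

```python
def is_subset(A, B_contains):
    """Check each element of the finite set A against the test for B.

    Returns (True, None) if every element passes; otherwise returns
    (False, witness), where witness is an element of A not in B.
    """
    for element in A:
        if not B_contains(element):
            return False, element
    return True, None


A = {7, 2, 5}
B = {2, 3, 5, 7, 11}
result_finite = is_subset(A, lambda x: x in B)

# B given by a membership criterion: points (x, y) with x - 4y = -3.
points = {(5, 2), (1, 1), (-3, 0)}
result_line = is_subset(points, lambda p: p[0] - 4 * p[1] == -3)

# A single counterexample is enough: 4 belongs to {2, 4} but not to B.
result_counter = is_subset({2, 4}, lambda x: x in B)
```

When A is infinite this element-by-element check is not available, which is why an algebraic argument about an arbitrary element of A is needed instead.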


Worked Exercise A4 


In each of the following cases, determine whether A ⊆ B.
(a) A= {1,2,—4} and B = {z € R : zř + 4zî — z — 4 = 0}. 
(b) A = {(x, y) ∈ R² : x² + y² < 1} and B = {(x, y) ∈ R² : x < 1}.




Exercise A19 


In each of the following cases, determine whether A ⊆ B.

(a) A = {(5, 2), (1, 1), (−3, 0)} and B = {(x, y) ∈ R² : x − 4y = −3}.
(b) A = {(x, y) ∈ R² : x² + y² < 1} and B = {(x, y) ∈ R² : y < 0}.
(c) A = [−1, 0] and B = {x ∈ R : (x + 1)² ≤ 1}.


If a set A is a subset of a set B that is not equal to B, then we say that A
is a proper subset of B, and we write A ⊂ B or B ⊃ A.


In some texts, the symbol ⊂ is used to mean ‘is a subset of’ (for which we
use the symbol ⊆) rather than ‘is a proper subset of’.


To show that a set A is a proper subset of a set B, we must show both 
that A is a subset of B, and that there is at least one element of B that is 
not an element of A. 


Worked Exercise A5 
Show that A is a proper subset of B, where: 
A = {(x, y) ∈ R² : x² + y² < 1} and B = {(x, y) ∈ R² : x < 1}.
(A and B are the sets you met in Worked Exercise A4(b).) 


Exercise A20 


In each of the following cases show that A is a proper subset of B. 
(a) A = {(5, 2), (1, 1), (−3, 0)} and B = {(x, y) ∈ R² : x − 4y = −3}.
(b) A = [−1, 0] and B = {x ∈ R : (x + 1)² ≤ 1}.


(These sets are the same as those in Exercise A19(a) and (c).) 


We now return to the question of how we can show that two sets A and B 
are equal if they have more than a small number of elements. 


If A is a subset of B, we have seen that A is either a proper subset of B or 
is equal to B. Similarly, if B is a subset of A, then B is either a proper 
subset of A or is equal to A. It follows that, if A is a subset of B and B is 
a subset of A, then the two sets A and B must be equal. This gives us our 
strategy. 


Strategy A1 

To show that the sets A and B are equal: 
• first show that A ⊆ B

• then show that B ⊆ A.
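
For finite sets, the two steps of this strategy can be mirrored directly in code. The sketch below is only an illustration: the function names are hypothetical, and B is built from a finite window of integers (−10 to 10) standing in for the solution set of x² − 1 = 0.

```python
def subset(A, B):
    """True if each element of A is also an element of B."""
    return all(x in B for x in A)


def equal_by_double_inclusion(A, B):
    """Show A = B by checking A ⊆ B first and then B ⊆ A."""
    return subset(A, B) and subset(B, A)


A = {1, -1}
# Finite stand-in for {x : x^2 - 1 = 0}, searching the integers -10..10 only.
B = {x for x in range(-10, 11) if x**2 - 1 == 0}

sets_are_equal = equal_by_double_inclusion(A, B)
```

If either inclusion fails, the sets are not equal; for instance, {1} is a subset of B, but B is not a subset of {1}.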




Worked Exercise A6 


Show that the following sets are equal: 
A = {(cos t, sin t) : t ∈ [0, 2π]} and
B = {(x, y) ∈ R² : x² + y² = 1}.


Solution 


® We could specify A by
A = {(x, y) ∈ R² : x = cos t, y = sin t for some t ∈ [0, 2π]}. ©


First we show that A ⊆ B.
Let (x, y) be an arbitrary element of A; then (x, y) is a point in R².
We have x = cos t and y = sin t, for some t ∈ [0, 2π]. So
x² + y² = cos² t + sin² t = 1.
This implies that (x, y) ∈ B, so A ⊆ B.
Next we show that B ⊆ A.
Let (x, y) be an arbitrary element of B; then
x² + y² = 1.
So (x, y) lies on the unit circle.


® To show that (x, y) is an element of A, we need to find an angle
t ∈ [0, 2π] such that (x, y) = (cos t, sin t). A sketch will help. ©


If we take t to be the (anticlockwise) angle from the (positive) x-axis
to the line joining the point (x, y) with the origin, then t ∈ [0, 2π], and


x = cos t and y = sin t.

[Figure: the point (x, y) on the unit circle, with sin t marked]


It follows that (x, y) ∈ A, so B ⊆ A.
Since A ⊆ B and B ⊆ A, it follows that A = B.
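
The two inclusions in this argument can be spot-checked numerically. The Python sketch below is a sanity check on a few sample points, not a proof; the function names, the sample angles and the point (−0.6, 0.8) are all chosen for illustration, and a small tolerance allows for floating-point rounding.

```python
import math


def angle_for(x, y):
    """Anticlockwise angle from the positive x-axis to (x, y), in [0, 2π]."""
    return math.atan2(y, x) % (2 * math.pi)


def on_unit_circle(x, y, tol=1e-12):
    """True if x^2 + y^2 equals 1 up to rounding error."""
    return abs(x * x + y * y - 1) < tol


# A ⊆ B: sample points (cos t, sin t) with t in [0, 2π] satisfy x² + y² = 1.
samples = [2 * math.pi * k / 12 for k in range(13)]
a_in_b = all(on_unit_circle(math.cos(t), math.sin(t)) for t in samples)

# B ⊆ A: a point of the unit circle is recovered from its angle t.
x, y = -0.6, 0.8
t = angle_for(x, y)
recovered = (math.cos(t), math.sin(t))
```

Here `math.atan2` returns an angle between −π and π, so the remainder on division by 2π is taken to move it into the range used in the worked exercise.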



Exercise A21 


In each of the following cases, show that the sets A and B are equal. 
(a) A = {(t², 2t) : t ∈ R} and B = {(x, y) ∈ R² : y² = 4x}.
(b) A = {(x, y) ∈ R² : 2x + y − 3 = 0} and B = {(t + 1, 1 − 2t) : t ∈ R}.


2.6 Set operations 


Consider the two sets {2,3,5} and {1,2,5,8}. Using these sets, we can 
construct several new sets — for example: 


• the set {1, 2, 3, 5, 8} consisting of all elements belonging to at least one
of the two sets


• the set {2, 5} consisting of all elements belonging to both of the two sets


• the set {3} consisting of all elements belonging to the first set but not
the second, and the set {1, 8} consisting of all elements belonging to the
second set but not the first.


Each of these new sets is a particular instance of a general construction for 
sets. We now consider them in turn. 


Union 


You saw above that if A = {2,3,5} and B = {1,2,5,8}, then the set of all 
elements belonging to at least one of the sets A and B is {1,2,3,5,8}. We 
call this set the union of A and B. 


More generally, we have the following definition, which is illustrated by the 
Venn diagram in Figure 26. 


Definition


Let A and B be any two sets; then the union of A and B is the set


A ∪ B = {x : x ∈ A or x ∈ B}.


Figure 26 The union of sets A and B


The word or in this definition is used in the inclusive sense of ‘and/or’;
that is, the set A ∪ B consists of the elements of A and the elements of B,
including the elements in both A and B. In everyday language, an example 
of ‘or’ used in the exclusive sense is ‘Tea or coffee?’, since the answer 
‘Both, please!’ is not expected. An example of ‘or’ used in the inclusive 
sense is ‘Milk or sugar?’, since in this case you could answer ‘Both’. 
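
The distinction between the inclusive and exclusive senses of ‘or’ can be seen with Python's built-in sets, used here purely as an illustration. The operator `|` gives the union, which uses the inclusive ‘or’; the operator `^` keeps only the elements lying in exactly one of the two sets, which corresponds to the exclusive ‘or’ and is not the union.

```python
A = {2, 3, 5}
B = {1, 2, 5, 8}

# Inclusive 'or': elements in A or in B (or in both). This is the union.
inclusive = A | B

# Exclusive 'or': elements in exactly one of A and B. This is NOT the union.
exclusive = A ^ B

# The union written as in the definition, over a small range of candidates.
by_definition = {x for x in range(10) if x in A or x in B}
```

The elements 2 and 5, which lie in both A and B, belong to `inclusive` but not to `exclusive`.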




Worked Exercise A7 


(a) Simplify [−2, 4] ∪ (0, 10).
(b) Sketch a diagram depicting the union of the half-plane H and the disc
D, where
H = {(x, y) ∈ R² : y ≤ 2 − 2x},
D = {(x, y) ∈ R² : (x − 1)² + (y − 2)² < 4}.


Solution 


(a) ® These intervals overlap. ©

[Figure: the intervals [−2, 4] and (0, 10) on a number line marked −2, 0, 4 and 10]

We have [−2, 4] ∪ (0, 10) = [−2, 10).

(b) ® These are the half-plane and disc from Exercise A16(b)
and (c). ©

[Figure: the half-plane H and the disc D, with the circle (x − 1)² + (y − 2)² = 4 labelled]

® The union consists of all the points in H or D or both; the two
points where the circle and line meet are both in the set H and
so are both in the union H ∪ D and are shown as solid dots. ©

The set H ∪ D is as follows.

[Figure: the set H ∪ D]


When sketching a set such as that in Worked Exercise A7(b), you should 
include enough detail so that the set is clear, and therefore the axes and an 
indication of scale are essential. Finding the exact points where the circle 
and line meet is not required, but can sometimes be helpful. In this case, 
substituting y = 2 − 2x into the equation for the circle gives

(x − 1)² + (−2x)² = 4,
which simplifies to

5x² − 2x − 3 = 0.
This factorises as

(x − 1)(5x + 3) = 0,


which has solutions x = 1 and x = −3/5, so the circle and line meet at the
two points (1, 0) and (−3/5, 16/5).
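
The two meeting points can be checked by substituting them back into both equations. The following Python sketch does this with exact fractions, so that no rounding is involved; the function names are hypothetical and used only for this check.

```python
from fractions import Fraction


def on_line(x, y):
    """True if (x, y) lies on the line y = 2 - 2x."""
    return y == 2 - 2 * x


def on_circle(x, y):
    """True if (x, y) lies on the circle (x - 1)^2 + (y - 2)^2 = 4."""
    return (x - 1) ** 2 + (y - 2) ** 2 == 4


points = [
    (Fraction(1), Fraction(0)),
    (Fraction(-3, 5), Fraction(16, 5)),
]

# Each point should satisfy both the line equation and the circle equation.
checks = [on_line(x, y) and on_circle(x, y) for x, y in points]
```

For the second point, (−3/5 − 1)² + (16/5 − 2)² = 64/25 + 36/25 = 4, as required.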


Exercise A22 


(a) Simplify (1, 7) ∪ [4, 11].
(b) Express the set R* as a union of intervals.
(c) Sketch a diagram depicting the union of the half-plane H and disc D,
where
H = {(x, y) ∈ R² : y < 0},
D = {(x, y) ∈ R² : x² + y² < 4}.


So far you have seen the definition of the union of two sets. There is a
similar definition for the union of any number of sets; for example, the
union of three sets A, B and C, as illustrated by the Venn diagram in
Figure 27, is the set


A ∪ B ∪ C = {x : x ∈ A or x ∈ B or x ∈ C}.


Figure 27 The union of sets A, B and C


Intersection


You saw above that if A = {2,3,5} and B = {1,2,5,8}, then the set of all
elements belonging to both set A and set B is {2,5}. We call this set the
intersection of A and B.


More generally, we have the following definition, which is illustrated by the
Venn diagram in Figure 28.


Definition
Let A and B be any two sets; then the intersection of A and B is
the set


A ∩ B = {x : x ∈ A and x ∈ B}.


Figure 28 The intersection of sets A and B



Two sets with no element in common, such as {1,3,5} and {2,9}, are said
to be disjoint. We write this as {1,3,5} ∩ {2,9} = ∅ since this
intersection is empty.
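
As a brief illustration with Python's built-in sets, the operator `&` gives the intersection, and the method `isdisjoint` tests whether two sets have no element in common. The variable names are chosen only for this example.

```python
A = {2, 3, 5}
B = {1, 2, 5, 8}

# Elements belonging to both A and B.
common = A & B

# Two sets are disjoint exactly when their intersection is the empty set.
disjoint_pair = {1, 3, 5}.isdisjoint({2, 9})
empty_intersection = {1, 3, 5} & {2, 9}
```

Note that Python writes the empty set as `set()`, because `{}` denotes an empty dictionary.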


Worked Exercise A8 


(a) Simplify [−2, 4] ∩ (0, 10).
(b) Sketch a diagram depicting the intersection of the half-plane H and
disc D, where
H = {(x, y) ∈ R² : y ≤ 2 − 2x},
D = {(x, y) ∈ R² : (x − 1)² + (y − 2)² < 4}.


Solution 


(a) ® The intersection is the overlap of these intervals. ©

[Figure: the intervals [−2, 4] and (0, 10) on a number line marked −2, 0, 4 and 10]

We have [−2, 4] ∩ (0, 10) = (0, 4].


(b) ® These are the half-plane and disc from Exercise A16(b)
and (c), and Worked Exercise A7. ©


® The intersection consists of all the points in both H and D.
Neither of the points where the circle and the line meet is in the
set D, so these points are not in the intersection H ∩ D, and
both are shown as hollow dots. ©


The set H ∩ D is as follows.

[Figure: the set H ∩ D]


Exercise A23 


(a) Simplify (1, 7) ∩ [4, 11].


(b) Sketch a diagram depicting the intersection of the half-plane H and
disc D, where


H = {(x, y) ∈ R² : y < 0},
D = {(x, y) ∈ R² : x² + y² < 4}.


(These are the same sets as in Exercise A22(a) and (c).)


So far you have seen the definition of the intersection of two sets. There is
a similar definition for the intersection of any number of sets; for example,
the intersection of three sets A, B and C, as illustrated by the Venn
diagram in Figure 29, is the set


A ∩ B ∩ C = {x : x ∈ A and x ∈ B and x ∈ C}.


Figure 29 The intersection of sets A, B and C


Difference


You saw above that if A = {2,3,5} and B = {1,2,5,8}, then the set of all
elements belonging to A but not to B is {3}; we call this set the
difference A − B. Similarly, the set of all elements belonging to B but not
to A is {1,8}; this set is the difference B − A.


More generally, we have the following definition, which is illustrated by the 
Venn diagram in Figure 30. 


Definition


Let A and B be any two sets; then the difference between A and B
is the set


A − B = {x : x ∈ A, x ∉ B}.


Figure 30 The difference between set A and set B


Notice that A − B is different from B − A when A ≠ B. This is unlike the
union and intersection, where A ∪ B = B ∪ A and A ∩ B = B ∩ A, for any
sets A and B. Also, for any set A, we have A − A = ∅, again unlike the
union and intersection, where A ∪ A = A ∩ A = A.


In some texts the difference A − B of two sets A and B is denoted by A \ B.
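
These properties of the difference can be checked on the running example with Python's built-in sets, where the operator `-` gives the set difference. This is an illustration on one pair of sets, not a proof of the general statements.

```python
A = {2, 3, 5}
B = {1, 2, 5, 8}

a_minus_b = A - B   # elements of A that are not in B
b_minus_a = B - A   # elements of B that are not in A

# Unlike union and intersection, the difference depends on the order.
order_matters = a_minus_b != b_minus_a
union_commutes = (A | B) == (B | A)
intersection_commutes = (A & B) == (B & A)

# A - A is empty, whereas A ∪ A and A ∩ A are both equal to A.
self_difference = A - A
```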




Worked Exercise A9 


(a) Simplify [−2, 4] − (0, 10) and (0, 10) − [−2, 4].
(b) Sketch diagrams depicting the differences H − D and D − H of the
half-plane H and disc D, where
H = {(x, y) ∈ R² : y ≤ 2 − 2x},
D = {(x, y) ∈ R² : (x − 1)² + (y − 2)² < 4}.


Solution
(a) [Figure: the sets [−2, 4] − (0, 10) and (0, 10) − [−2, 4] on number lines marked −2, 0, 4 and 10]

We have


[−2, 4] − (0, 10) = [−2, 0],
(0, 10) − [−2, 4] = (4, 10).


(b) ® Again these are the half-plane and disc from Exercise A16(b)
and (c), and Worked Exercises A7 and A8. ©


® Consider carefully the boundary points, and in particular, the
points where the line and circle meet. Both of the meeting points
are in H − D, as are the remaining points of the boundaries.
Neither of the meeting points is in the difference D − H, nor are
the remaining points of the boundaries. ©


The sets H − D and D − H are as follows.

[Figure: the sets H − D and D − H]

Exercise A24 


(a) Simplify (1, 7) − [4, 11] and [4, 11] − (1, 7).
(b) Sketch diagrams depicting the differences H − D and D − H of the
half-plane H and disc D, where
H = {(x, y) ∈ R² : y < 0},
D = {(x, y) ∈ R² : x² + y² < 4}.


(These are the same sets as in Exercise A22(a) and (c), and Exercise A23.)


3 Functions 


In this section you will revise what is meant by a function, and some
associated ideas. You will look at not only functions of real numbers, but
also functions of other mathematical objects. The idea of a function is
fundamental throughout this module, so it is vital that you have a good
understanding of this topic.


Gottfried Wilhelm Leibniz


The term ‘function’ first emerged at the end of the seventeenth 
century in the correspondence of Gottfried Wilhelm Leibniz 
(1646-1716) and Johann Bernoulli (1667-1748). But it was Leonhard 
Euler (1707-1783) in the middle of the eighteenth century who was 
responsible for the essential development, notably through his 
Introductio in Analysin Infinitorum of 1748, the first work in which 
the concept of a function plays an explicit and central role. 


3.1 What is a function? 


You can think of a function as a machine for processing mathematical 
objects, such as numbers, points in the plane or vectors. 


Johann Bernoulli 


For example, consider the function f that takes non-zero real numbers as
its inputs and whose rule is that the input x leads to the output
f(x) = 1/x. You can regard it as a machine that calculates the reciprocals
of its input numbers. When 3 is fed into the machine, out comes 1/3; when
−2 is fed into the machine, out comes −1/2; and so on. Any real number in
the domain R* of f can be processed by the machine to produce a real
number in the codomain R of f, as illustrated in Figure 31.


Leonhard Euler 


Figure 31 A function as a machine
Figure 33 A general function


Similarly, consider the function g that accepts points in the plane as its
inputs and whose rule is that the input (x, y) leads to the output
g((x, y)) = y. You can regard it as a machine that calculates the
y-coordinate of each input point. When the point (1, 2) is fed into the
machine, out comes 2; when the point (0, 0) is fed into the machine, out
comes 0; and so on. Any point in the domain R² of g can be processed by
the machine to produce a real number in the codomain R of g, as
illustrated in Figure 32.


Figure 32 Another function as a machine


In general, imagine a machine that accepts an element x from some set A,
and processes it to produce a single element f(x) in some set B. This
machine corresponds to the following general definition of a function,
which is illustrated in Figure 33.


Definition

A function f is defined by specifying:
• a set A, called the domain of f

• a set B, called the codomain of f

• a rule x ↦ f(x) that associates each element x ∈ A with a unique
element f(x) ∈ B.


The element f(x) is the image of x under f.


Symbolically, we write
f : A → B
x ↦ f(x).


We often refer to a function as a mapping, and say that f maps A to B
and x to f(x).


Notice that the definition of a function does not require every element of
the codomain B to be the image of an element of the domain A, but it
does require every element of the domain A to have an image in the
codomain B. For example, a function with rule x ↦ sin x and domain R
could have codomain R, or [−1, 1], or any set of real numbers of which
[−1, 1] is a subset, but not, say, codomain [0, 1] since the image of 3π/2 is
sin(3π/2) = −1, which is not in this set.


Notice also that the symbolic definition of a function given at the end of
the box above specifies all three of the constituent parts of a function at
once: the domain, the codomain and the rule. For example, the definition


f : Z → Z
n ↦ n + 1
specifies a function with domain Z, codomain Z and rule f(n) = n + 1.
When we write a function symbolically, the first arrow is unbarred to
signify a mapping from the domain A to the codomain B. The second
arrow is barred, to show that the particular element x of A is mapped to
the particular element f(x) of B. Each arrow is read as ‘maps to’.


The following paragraphs give a number of examples of different types of 
functions. 


Real functions 


A function whose domain and codomain are both subsets of R is called a
real function. Examples include the functions


f : R* → R
x ↦ 1/x
and
g : R → R
x ↦ 2x − 5.


In some texts, a real function is defined to be a function whose codomain
is a subset of R, but whose domain can be any set.


You may be more familiar with seeing these functions written as simply
f(x) = 1/x and g(x) = 2x − 5. We write functions in this shortened way
when it is understood from the context what the domain and codomain
are.


Distance function
Functions of the form f : R² → R can be used to specify quantities
associated with points in the plane. For example, the function
f : R² → R
(x, y) ↦ √(x² + y²)
gives the distance of each point (x, y) in the plane from the origin, as
shown in Figure 34.


Figure 34 The distance of a point from the origin


Transformations of the plane
Functions that have a geometric interpretation are often called
transformations. Such functions include translations, reflections and
rotations of the plane. We now look at some simple examples. For each
one, the diagram shows the effect of the transformation on the square
whose vertices are at (0,0), (1,0), (1,1) and (0,1); part of the square is
shaded for clarity.
• The transformation
f : R² → R²
(x, y) ↦ (x + 2, y)
is the translation of the plane that shifts (or translates) each point to
the right by 2 units, as illustrated in Figure 35.


Figure 35 Translation 2 units to the right


• The transformation
f : R² → R²
(x, y) ↦ (−x, y)
is the reflection of the plane in the y-axis, as illustrated in Figure 36.


Figure 36 Reflection in the y-axis


• The transformation
f : R² → R²
(x, y) ↦ (−x, −y)
is the rotation of the plane through an angle π about the origin, as
illustrated in Figure 37.


Figure 37 Rotation through an angle π about the origin


When specifying a function, like a transformation, where the elements of
the domain are of the form (x, y), we simply write f(x, y) rather than
f((x, y)).
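
The three transformations described above can be written as short Python functions acting on points, represented here as pairs. The function names are hypothetical, chosen for this illustration; applying each function to the four vertices of the square shows where the corners are sent.

```python
def translate(p):
    """Translation 2 units to the right: (x, y) -> (x + 2, y)."""
    x, y = p
    return (x + 2, y)


def reflect(p):
    """Reflection in the y-axis: (x, y) -> (-x, y)."""
    x, y = p
    return (-x, y)


def rotate(p):
    """Rotation through an angle π about the origin: (x, y) -> (-x, -y)."""
    x, y = p
    return (-x, -y)


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
translated = [translate(p) for p in square]
reflected = [reflect(p) for p in square]
rotated = [rotate(p) for p in square]
```

Applying `rotate` twice returns every point to its starting position, as expected for a rotation through π.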


Exercise A25 


For each of the following functions f : R² → R², state whether f is a
translation, reflection or rotation of the plane.

(a) f(x, y) = (x + 2, y + 3)

(b) f(x, y) = (x, −y)

(c) f(x, y) = (−y, x)


Functions whose domains are finite sets 


It is often useful to consider a function whose domain is a finite set. For 
example, we can define a function whose domain and codomain are the set 


A = {0,1,2,3,4,5,6,7,8,9} 


by 
f : A → A
x ↦ 9 − x.
When the domain of a function f has a small number of elements, we can 
specify the rule of f by listing the image f(x) of each element x in the 
domain. For example, let A = {0,1,2,3} and B = {2,3,4,5}; then we can 
define a function f : A — B by the rule 


f(0) = 2, f(1) = 2, f(2) = 4, f(3) = 5.


We can represent the behaviour of this function by a diagram, as shown in 
Figure 38. A diagram of this type that represents a function always has 
the following features: 


• there is exactly one arrow from each element in the domain, since each
element in the domain has exactly one image in the codomain


• there may be no arrows, one arrow or several arrows going to an element
in the codomain, since an element in the codomain may not be an image
at all, may be an image of exactly one element in the domain, or may be
an image of several elements in the domain.


Figure 38 Function f from set A to set B


In the example shown in Figure 38, the number 3 is not an image at all, 5 
is the image of 3 only, and 2 is the image of both 0 and 1. 
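
A function with a small finite domain can be modelled in Python by a dictionary that lists the image of each element. The sketch below is an illustration only, with hypothetical names: a dictionary automatically gives each key exactly one value, matching the requirement that there is exactly one arrow from each element of the domain.

```python
domain = {0, 1, 2, 3}
codomain = {2, 3, 4, 5}
rule = {0: 2, 1: 2, 2: 4, 3: 5}   # f(0) = 2, f(1) = 2, f(2) = 4, f(3) = 5


def is_function(domain, codomain, rule):
    """Every domain element has an image, and each image is in the codomain."""
    return set(rule) == domain and all(v in codomain for v in rule.values())


def arrows_into(rule, b):
    """The set of domain elements whose image is b."""
    return {a for a, image in rule.items() if image == b}
```

Here `arrows_into(rule, 3)` is empty, `arrows_into(rule, 5)` contains only 3, and `arrows_into(rule, 2)` contains both 0 and 1, matching the three cases described above.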


Exercise A26 
Which of the following diagrams represent(s) a function?
[Figure: three arrow diagrams, labelled (a), (b) and (c)]


Identity functions 


Associated with any set $A$, there is a particularly simple function whose
domain and codomain are the set $A$. This is the identity function $i_A$,
which maps each element of $A$ to itself. (We sometimes omit the subscript
$A$ if we do not need to emphasise the set.)

For example, let $A = \{0, 1, 2, 3\}$; then the rule of the identity function $i_A$,
as illustrated in Figure 39, is

$i_A(0) = 0, \quad i_A(1) = 1, \quad i_A(2) = 2, \quad i_A(3) = 3.$

Figure 39 An identity function

The following definition applies to any set $A$, finite or infinite.

Definition 


The identity function on a set A is the function 
$i_A : A \to A$
$x \mapsto x.$


3.2 Image set of a function 


The rule associated with a function tells us how to find the image of any 
element in the domain. Often, however, we need to consider the images of 
all elements in some subset of the domain. The subset of the codomain 
containing these images is called the image of the original subset, as stated 
below and illustrated in Figure 40. 


Definition 


Let $f : A \to B$ be a function. For any subset $S$ of $A$, the image of $S$
under $f$, denoted by $f(S)$, is the set

$f(S) = \{f(x) : x \in S\}.$


Figure 40 Image of a set S under a function f


Worked Exercise A10 


Find f(S), where S = {1,2,3} and 


f:R— R 
1 
LS =; 
Ey be 
Solution 


f(S) = {f@), f@), £(3)} = {1 $, 4}. 


Exercise A27 


Let 
f:R— R 
trou. 
Find the image under f of each of the following sets. 
(a) -S =10,1,2,2} b) Z 


The idea of the image of a subset of elements is useful in geometry, for
example, where we frequently want to consider the effect of a
transformation on a plane figure, a subset of $\mathbb{R}^2$. For example, suppose
that $S$ is the square with vertices at $(0,0)$, $(1,0)$, $(1,1)$ and $(0,1)$, and we
want to find the image of $S$ under the function

$f : \mathbb{R}^2 \to \mathbb{R}^2$
$(x, y) \mapsto (x + 2, y).$

This function is the translation of the plane that moves each point $(x, y)$ to
the right by 2. The image of $S$ is therefore the square with vertices at
$f(0,0) = (2,0)$, $f(1,0) = (3,0)$, $f(1,1) = (3,1)$ and $f(0,1) = (2,1)$, as
shown in Figure 41.



Figure 41 The image f(S) of a square S under a translation f 
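The vertex calculation for the translated square can be reproduced with a short sketch. It assumes points of the plane are stored as Python tuples; the names `S_vertices` and `image_vertices` are chosen here for illustration. Only the four vertices are mapped, which is enough to locate the image square because a translation moves the whole figure rigidly.

```python
def f(point):
    """The translation (x, y) -> (x + 2, y) of the plane."""
    x, y = point
    return (x + 2, y)


# Vertices of the square S, listed anticlockwise from the origin.
S_vertices = [(0, 0), (1, 0), (1, 1), (0, 1)]

# Image of each vertex under f, in the same order.
image_vertices = [f(p) for p in S_vertices]
print(image_vertices)
```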


Sometimes we want to consider the image of the whole domain of a
function: this set is called the image set of the function, as illustrated in
Figure 42.

Figure 42 Image set of a function f



Definition 
The image set of a function $f : A \to B$ is the set


$f(A) = \{f(x) : x \in A\}.$


The image set of a function is a subset of its codomain. It need not be 
equal to the codomain because there may be some elements of the 
codomain that are not images of elements in the domain. 


In some texts, the image set of a function is called the image of the 
function, or the range of the function. 


When the domain of a function f has a small number of elements, we can 
find the image set of f by finding the image of each element in the domain, 
and listing them to form a set. 


Worked Exercise A11 


Let $A = \{-3, -2, -1, 0, 1, 2, 3\}$ and $B = \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$.
Find the image set of the function
$f : A \to B$
$x \mapsto x^2.$


Exercise A28 


Let $A = \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$.
Find the image set of the function
$f : A \to A$
$x \mapsto 9 - x.$


You should have found that for the particular function in Exercise A28 the
image set and the codomain are the same set. In other words, each
element of the codomain is the image of an element in the domain, as
illustrated in Figure 43. A function with this property is said to be onto.

Figure 43 An onto function: $f(A) = B$



Definition 
A function $f : A \to B$ is onto if $f(A) = B$.


Some texts refer to an onto function as a surjective function. 
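For a function with a small finite domain, the image set and the onto property can be computed directly. The sketch below again assumes a dictionary representation of the function; `image_set` and `is_onto` are hypothetical helper names. It uses the function of Figure 38, whose codomain contains an element, 3, that is not an image.

```python
def image_set(rule):
    """Image set f(A): the set of all images of elements of the domain."""
    return set(rule.values())


def is_onto(rule, codomain):
    """A function is onto if its image set equals its codomain."""
    return image_set(rule) == set(codomain)


f = {0: 2, 1: 2, 2: 4, 3: 5}   # the function of Figure 38
B = {2, 3, 4, 5}               # its codomain

print(image_set(f))            # the image set is a subset of B
print(is_onto(f, B))           # 3 is not an image, so f is not onto
print(is_onto(f, {2, 4, 5}))   # same rule, smaller codomain: a different function
```

The last line illustrates that whether a function is onto depends on the codomain as well as on the rule.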


Exercise A29 
Which of the following diagrams represent(s) an onto function? 


(a) (b) (c) 




You have seen that if the domain of a function is a small finite set, then we 
can find the image set of the function by finding the image of each element 
of the domain individually. If the domain is a large finite set or an infinite 
set, then we need an algebraic argument to determine the image set. 
Sometimes we ‘guess’ what the image set seems to be, and then confirm 
this algebraically. 


For a real function, a sketch of its graph can help us ‘guess’ the image set. 
For a function that is a transformation of the plane, we can use our 
knowledge of such transformations to help us ‘guess’ the image set. 


To show that the image set is equal to our ‘guess’ set, we use our usual 
strategy for showing that two sets are equal: we show that each is a subset 
of the other. 


• To show that the image set is a subset of our ‘guess’ set, we show that
the image of an arbitrary element of the domain lies in our ‘guess’ set.

• To show that our ‘guess’ set is a subset of the image set, we take an
arbitrary element of our ‘guess’ set and find an element of the domain
whose image is this arbitrary element.


Worked Exercise A12 


For each of the following functions, find its image set and determine 
whether it is onto. 
(a) $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto 2x - 5$
(b) $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto x^2$
(c) $f : \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (x + 1, y + 2)$



Solution 
(a) A sketch of the graph of f is shown below. 


(For every element on the $y$-axis, a horizontal line drawn
through that element meets the graph. So it seems that every
element of the codomain is the image of some element of the
domain. That is, we ‘guess’ that the image set $f(\mathbb{R})$ is the whole
codomain $\mathbb{R}$.)


We prove that $f(\mathbb{R}) = \mathbb{R}$.


(The image set is always a subset of the codomain; in this case
the codomain is $\mathbb{R}$, so $f(\mathbb{R}) \subseteq \mathbb{R}$.)


We know that $f(\mathbb{R}) \subseteq \mathbb{R}$, so we must show that $f(\mathbb{R}) \supseteq \mathbb{R}$.


(We take an arbitrary element in our ‘guess’ set $\mathbb{R}$, and find an
element in the domain $\mathbb{R}$ whose image is this arbitrary
element.)


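Let $y$ be an arbitrary element in $\mathbb{R}$. We must show that $y \in f(\mathbb{R})$;
that is, there exists an element $x$ in the domain $\mathbb{R}$ such that

$f(x) = y$; that is, $2x - 5 = y$.

Rearranging this equation, we obtain

$x = \dfrac{y + 5}{2},$

which is in the domain $\mathbb{R}$. So we have

$f(x) = 2x - 5 = 2\left(\dfrac{y + 5}{2}\right) - 5 = y;$

that is, for every $y \in \mathbb{R}$ there is an $x$ in the domain $\mathbb{R}$ such that
$f(x) = y$.


Thus $f(\mathbb{R}) \supseteq \mathbb{R}$.


Since $f(\mathbb{R}) \subseteq \mathbb{R}$ and $f(\mathbb{R}) \supseteq \mathbb{R}$, it follows that $f(\mathbb{R}) = \mathbb{R}$, so the
image set of $f$ is indeed $\mathbb{R}$.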


The codomain of f is also R, so f is onto. 
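The element $x = (y + 5)/2$ found in part (a) can be spot-checked numerically. The sketch below uses Python's exact `Fraction` type on a few sample values of $y$, chosen arbitrarily here; the name `preimage` is illustrative. Such a check is only a sanity test on finitely many values and does not replace the algebraic argument.

```python
from fractions import Fraction


def f(x):
    """The function x -> 2x - 5."""
    return 2 * x - 5


def preimage(y):
    """Candidate element of the domain whose image is y, from the rearranged equation."""
    return (y + 5) / 2


# Arbitrary sample elements of the codomain, kept exact as fractions.
samples = [Fraction(-7), Fraction(0), Fraction(1, 3), Fraction(22, 7)]

# For each sample y, check that f(preimage(y)) gives back y exactly.
checks = [f(preimage(y)) == y for y in samples]
print(checks)
```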




(b) A sketch of the graph of f is shown below. 


(For every element in the interval $[0, \infty)$ of the $y$-axis (marked
on the sketch), a horizontal line drawn through that element
meets the graph. For any element outside this interval, such a
horizontal line does not meet the graph. So we ‘guess’ that the
image set $f(\mathbb{R})$ is $[0, \infty)$.)

We prove that $f(\mathbb{R}) = [0, \infty)$.

(We know that the image set is a subset of the codomain $\mathbb{R}$,
but we don’t know that it is a subset of $[0, \infty)$. We have to show
algebraically that $f(\mathbb{R}) \subseteq [0, \infty)$ by finding the image of an
arbitrary element in the domain $\mathbb{R}$.)

Let $x$ be an arbitrary element in the domain $\mathbb{R}$; then $f(x) = x^2$.
Now, $x^2 \geq 0$ for all $x \in \mathbb{R}$, so $f(\mathbb{R}) \subseteq [0, \infty)$.

We must now show that $f(\mathbb{R}) \supseteq [0, \infty)$.

(We take an arbitrary element in our ‘guess’ set $[0, \infty)$, and
find an element of the domain $\mathbb{R}$ whose image is this arbitrary
element.)


Let $y$ be an arbitrary element in $[0, \infty)$. We must show that
there exists an element $x$ in the domain $\mathbb{R}$ such that

$f(x) = y$; that is, $x^2 = y$.

Now $x = \sqrt{y}$ is in $\mathbb{R}$ (since $y \geq 0$) and satisfies $f(x) = y$, as
required. Thus $f(\mathbb{R}) \supseteq [0, \infty)$.

Since $f(\mathbb{R}) \subseteq [0, \infty)$ and $f(\mathbb{R}) \supseteq [0, \infty)$, it follows that
$f(\mathbb{R}) = [0, \infty)$, so the image set of $f$ is $[0, \infty)$.

The image set $f(\mathbb{R}) = [0, \infty)$ is not the whole of the codomain $\mathbb{R}$,
so $f$ is not onto.


(If we had simply been asked to determine whether $f$ is onto,
we could have shown that it is not by finding just one element,
say $-1$, in the codomain $\mathbb{R}$ that is not the image of an element of
the domain $\mathbb{R}$.)


(c) (This function is a translation of the plane (it shifts each point
to the right by 1 unit and up by 2 units). So we expect (‘guess’)
the image set to be the plane $\mathbb{R}^2$.)


We prove that $f(\mathbb{R}^2) = \mathbb{R}^2$.


Exercise A30 


For each of the following functions, find its image set and determine 
whether it is onto. 
(a) $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto 1 + x^2$
(b) $f : \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (x, -y)$


As you have seen from Worked Exercise A12 and Exercise A30, when you 
want to determine whether a function is onto, it is crucial to take into 
account what the codomain of the function is. For example, you saw in 
Worked Exercise A12 that the function 


$f : \mathbb{R} \to \mathbb{R}$
$x \mapsto x^2$


is not onto. To see this, you just have to observe that the element $-1$, for
example, of the codomain is not the image of any element of the domain.
However, if you remove all the negative numbers from the codomain of this
function, then you obtain the new function


$g : \mathbb{R} \to [0, \infty)$
$x \mapsto x^2,$


which is onto, since every element of the codomain is an image of an 
element of the domain. Note that these functions f and g are different 
functions, since they have different codomains. 


3.3 Inverse functions 


Given a function

$f : A \to B$
$x \mapsto f(x),$

it is sometimes possible to define an inverse function that ‘undoes’ the
effect of $f$ by mapping each image element $f(x)$ back to the element $x$
whose image it is. For example, a rotation in the plane can be ‘undone’ by
a rotation in the opposite direction.

However, consider the function

$f : A \to B$
$x \mapsto x^2,$

where $A = \{-3, -2, -1, 0, 1, 2, 3\}$ and $B = \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$.
We know that $f(-2) = f(2) = 4$, and so a function that ‘undoes’ the effect
of $f$ must map the number 4 to the number $-2$ and to the number 2,
which is impossible. Thus, in this case, no inverse function exists. This
function $f$ is an example of a function that is many-to-one. A many-to-one
function does not have an inverse function.


Definitions 


A function $f : A \to B$ is one-to-one if each element of $f(A)$ is the
image of exactly one element of $A$; that is,


if $x_1, x_2 \in A$ and $f(x_1) = f(x_2)$, then $x_1 = x_2$.


A function that is not one-to-one is many-to-one. 


Thus a function f is one-to-one if it maps distinct elements in the domain 
A to distinct elements in the image set f(A). Some texts refer to a 
one-to-one function as an injective function. 


To prove that a function f is not one-to-one (that is, that the function is 
many-to-one), it is sufficient to find just one pair of distinct elements in 
the domain A with the same image under f. 
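For a finite domain, the search for such a pair can be automated. The following is a minimal sketch, assuming the function is stored as a dictionary; `find_collision` is a hypothetical helper name. It returns the first pair of distinct elements with the same image, or `None` if the function is one-to-one.

```python
from itertools import combinations


def find_collision(rule):
    """Return a pair of distinct domain elements with the same image, or None."""
    for x1, x2 in combinations(sorted(rule), 2):
        if rule[x1] == rule[x2]:
            return (x1, x2)
    return None


A = [-3, -2, -1, 0, 1, 2, 3]
square = {x: x * x for x in A}   # the rule x -> x^2 on A
identity = {x: x for x in A}     # the identity function on A

print(find_collision(square))    # a witness pair, so the function is many-to-one
print(find_collision(identity))  # None: no two distinct elements share an image
```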


Exercise A31 
Which of the following diagrams represent(s) a one-to-one function? 


(a) (b) (c) 




If the domain of a function is a large finite set or an infinite set, then to 
show that the function is one-to-one, we need an algebraic argument. We 
aim to show algebraically that, if two elements of the domain have the 
same image under the function, then they must actually be the same 
element, as demonstrated in Worked Exercise A13. 


Showing that a function is not one-to-one is more straightforward: we just 
give a pair of distinct elements that have the same image under the 
function, as you have seen. 


For a real function, an initial sketch of its graph can help us ‘guess’ 
whether or not the function is one-to-one, and if it is not one-to-one, the 
graph can also help us find a pair of elements that show this. 


Worked Exercise A13 


Determine which of the following functions are one-to-one. 
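(a) $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto 2x - 5$
(b) $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto x^2$
(c) $f : \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (x + 1, y + 2)$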


(These are the same functions as in Worked Exercise A12.) 


Solution 
(a) A sketch of the graph of f is shown below. 


(Each horizontal line meets the graph just once. So it seems
that no element of the codomain is the image of more than one
element of the domain. That is, it seems that $f$ is one-to-one. To
prove this, we show that if two elements $x_1$ and $x_2$ in the domain
have the same image, then they must actually be the same
element.)


We show that $f$ is one-to-one. Suppose that $f(x_1) = f(x_2)$; then
$2x_1 - 5 = 2x_2 - 5,$
so $2x_1 = 2x_2$, and hence $x_1 = x_2$.


Thus f is one-to-one. 


(b) A sketch of the graph of f is shown below. 


(Some horizontal lines meet the graph more than once. So it
seems that $f$ is not one-to-one. To show this, we find two distinct
elements of the domain with the same image.)


This function is not one-to-one since, for example,
$f(2) = f(-2) = 4.$


(c) (This function is a translation of the plane, so we expect it to
be one-to-one.)


We show that $f$ is one-to-one. Suppose that $f(x_1, y_1) = f(x_2, y_2)$;
then

$(x_1 + 1, y_1 + 2) = (x_2 + 1, y_2 + 2).$

Thus

$x_1 + 1 = x_2 + 1$ and $y_1 + 2 = y_2 + 2,$

so

$x_1 = x_2$ and $y_1 = y_2.$


Hence $(x_1, y_1) = (x_2, y_2)$, so $f$ is one-to-one.


Exercise A32 


Determine which of the following functions is one-to-one. 
(a) $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto 1 + x^2$
(b) $f : \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (x, -y)$


(These are the same functions as in Exercise A30.) 


For a one-to-one function $f : A \to B$, we have the situation illustrated in
Figure 44. Each element $y$ in $f(A)$ is the image of a unique element $x$ in
$A$, and so we can reverse the arrows to obtain the inverse function with
domain $f(A)$ and image set $A$, which maps $y$ back to $x$. When it exists, we
denote the inverse function of $f$ by $f^{-1}$.

Figure 44 A function and its inverse



Definition 
Let $f : A \to B$ be a one-to-one function. Then $f$ has an
inverse function $f^{-1} : f(A) \to A$, with rule


$f^{-1}(y) = x$, where $y = f(x)$.


Notice in this definition that the domain of $f^{-1}$ is $f(A)$; it is not
necessarily the whole of $B$.

However, if a function $f : A \to B$ is onto, as well as one-to-one, then $f$
has an inverse function $f^{-1} : B \to A$; that is, the domain of $f^{-1}$ is the
whole of $B$.

A function $f : A \to B$ that is both one-to-one and onto is said to be a
one-to-one correspondence between the sets $A$ and $B$. For such a
function $f$, not only is $f^{-1}$ the inverse of $f$, but also $f$ is the inverse of
$f^{-1}$; that is, the functions $f$ and $f^{-1}$ are inverses of each other.


Some texts refer to a one-to-one correspondence as a bijection. 
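For a finite one-to-one function, ‘reversing the arrows’ can be carried out mechanically. The sketch below assumes a dictionary representation; the function `h` and the helper `inverse` are hypothetical examples, not taken from the module text. The helper refuses to build an inverse when two elements share an image, since then the function is many-to-one.

```python
def inverse(rule):
    """Reverse the arrows of a one-to-one finite function; fail if it is many-to-one."""
    inv = {}
    for x, y in rule.items():
        if y in inv:
            raise ValueError(f"not one-to-one: {inv[y]} and {x} both map to {y}")
        inv[y] = x
    return inv


# Hypothetical one-to-one function from A = {0, 1, 2, 3} to B = {2, 3, 4, 5}.
h = {0: 3, 1: 5, 2: 2, 3: 4}
h_inv = inverse(h)
print(h_inv)

# The domain of the inverse is the image set of h, which here is all of B.
print(set(h_inv) == set(h.values()))
```

Applying `inverse` twice returns the original dictionary, in line with the remark that $f$ and $f^{-1}$ are inverses of each other.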


Worked Exercise A14 


For each of the following functions, determine whether f has an inverse 
function f—!; if it exists, find it. 
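(a) $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto 2x - 5$
(b) $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto x^2$
(c) $f : \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (x + 1, y + 2)$
(d) $f : [0, \infty) \to [-1, \infty)$, $x \mapsto 3x^2 - 1$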


(It does not matter whether the definition of $f^{-1}$ is expressed
in terms of $x$ or $y$, but it is more usual to use $x$ in the definition
of a real function.)


This definition can be expressed in terms of $x$ as

$f^{-1} : \mathbb{R} \to \mathbb{R}$
$x \mapsto \dfrac{x + 5}{2}.$


(b) In Worked Exercise A13(b), we showed that $f$ is not one-to-one,
so $f$ does not have an inverse function.


(c) In Worked Exercise A13(c), we showed that $f$ is one-to-one, so $f$
has an inverse function.


In Worked Exercise A12(c), we showed that the image set of $f$ is
$\mathbb{R}^2$ and that, for each $(x', y')$ in the image set $\mathbb{R}^2$, we have

$(x', y') = f(x, y) = f(x' - 1, y' - 2).$

(Under $f$, we know that $(x', y')$ is the image of $(x' - 1, y' - 2)$,
so under the inverse, $(x' - 1, y' - 2)$ is the image of $(x', y')$.)

So $f^{-1}$ is the function

$f^{-1} : \mathbb{R}^2 \to \mathbb{R}^2$
$(x', y') \mapsto (x' - 1, y' - 2).$

This definition can be expressed in terms of $x$ and $y$ as

$f^{-1} : \mathbb{R}^2 \to \mathbb{R}^2$
$(x, y) \mapsto (x - 1, y - 2).$

(This makes sense: geometrically, $f$ is the translation that
shifts each point to the right by 1 unit and up by 2 units, so we
expect the inverse to be a translation to the left by 1 unit and
down by 2 units.)


(d) A sketch of the graph of $f$ is shown below.



(Each horizontal line meets the graph just once. So it seems
that $f$ is one-to-one. To prove this, we show that if two elements
$x_1$ and $x_2$ in the domain have the same image, then they must
actually be the same element.)


We show that $f$ is one-to-one. Suppose that $f(x_1) = f(x_2)$; then

$3x_1^2 - 1 = 3x_2^2 - 1,$

so $3x_1^2 = 3x_2^2$, and hence $x_1^2 = x_2^2$. Since both $x_1$ and $x_2$ are in the
domain $[0, \infty)$, this implies that $x_1 = x_2$.

Thus $f$ is one-to-one.


(We now find the image set of $f$. From the sketch, we ‘guess’
that it is $[-1, \infty)$, the codomain of $f$. That is, we guess that $f$ is
onto.)

We prove that $f([0, \infty)) = [-1, \infty)$.

(The image set is a subset of the codomain.)

We know that $f([0, \infty)) \subseteq [-1, \infty)$, so we must show that
$f([0, \infty)) \supseteq [-1, \infty)$.

(We take an arbitrary element in our ‘guess’ set $[-1, \infty)$, and
find an element of the domain $[0, \infty)$ whose image is this
arbitrary element.)


Let $y$ be an arbitrary element in $[-1, \infty)$. We must show that
there exists an element $x$ in the domain $[0, \infty)$ such that

$f(x) = y$; that is, $3x^2 - 1 = y$.

Rearranging this equation, we obtain

$x^2 = \dfrac{y + 1}{3}.$

Since $y \in [-1, \infty)$, we know $y + 1 \geq 0$, so

$x = \sqrt{\dfrac{y + 1}{3}}$

is in the domain $[0, \infty)$. So we have

$f(x) = 3x^2 - 1 = 3\left(\dfrac{y + 1}{3}\right) - 1 = y;$

that is, for every $y \in [-1, \infty)$ there is an $x \in [0, \infty)$ such that
$f(x) = y$.


Exercise A33 


For each of the following functions, determine whether f has an inverse 

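function $f^{-1}$ and, if it exists, find it.

(a) $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto 1 + x^2$
(b) $f : \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (x, -y)$
(c) $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto 8x + 3$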


(For parts (a) and (b), use your answers from Exercises A30 and A32.) 


Restrictions 


When we are working with a function $f : A \to B$, it is sometimes
convenient to restrict attention to the behaviour of $f$ on some subset $C$ of
$A$. For example, consider the function

$f : \mathbb{R} \to \mathbb{R}$
$x \mapsto x^2.$

This function is not one-to-one and so does not have an inverse function.
However, if the domain of $f$ is replaced by the set $C = [0, \infty)$, then we
obtain a related function,

$g : C \to \mathbb{R}$
$x \mapsto x^2,$

shown in Figure 45. The rule is the same as for $f$, but the domain is
‘restricted’ to produce a new function $g$ that is one-to-one and so has an
inverse.

Figure 45 The function g with domain $[0, \infty)$


The function $g$ is an example of a restriction of $f$ in the sense that
$g(x) = f(x)$ for all $x$ in the domain of $g$.



More generally, we define a restriction, illustrated in Figure 46, as follows. 


Definition 
Let $f : A \to B$ and let $C$ be a subset of the domain $A$. Then the
function $g : C \to B$ defined by


$g(x) = f(x)$, for $x \in C$,


is the restriction of f to C. 
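A finite analogue of this definition is easy to write down. The sketch below assumes a dictionary representation and uses the squaring rule on a small set of integers in place of $\mathbb{R}$; `restrict` and `is_one_to_one` are hypothetical helper names. Restricting to the non-negative elements turns a many-to-one function into a one-to-one function.

```python
def restrict(rule, subset):
    """Restriction of a finite function to a subset C of its domain."""
    if not set(subset) <= set(rule):
        raise ValueError("C must be a subset of the domain")
    return {x: rule[x] for x in subset}


def is_one_to_one(rule):
    """One-to-one means that no two elements of the domain share an image."""
    return len(set(rule.values())) == len(rule)


f = {x: x * x for x in range(-3, 4)}   # x -> x^2 on {-3, ..., 3}
g = restrict(f, {0, 1, 2, 3})          # same rule, smaller domain

print(is_one_to_one(f), is_one_to_one(g))
print(all(g[x] == f[x] for x in g))    # g(x) = f(x) for all x in the domain of g
```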


Figure 46 The function g is 
the restriction of f to C 


Exercise A34 


Let f be the function 
$f : \mathbb{R} \to [-1, 1]$
$x \mapsto \sin x.$


Write down a restriction of f that is one-to-one. 


3.4 Composite functions 


In Subsection 3.1, you saw how a function may be regarded as a machine 
that processes elements in the domain to produce elements in the 
codomain. Now suppose that two such machines are linked together, so 
that the elements emerging from the first machine are fed into the second 
machine for further processing. The overall effect is to create a new 
‘composite’ machine that corresponds to a so-called composite function. 


For example, consider the real functions

$f : \mathbb{R} \to \mathbb{R}$, $x \mapsto x^2$ and $g : \mathbb{R} \to \mathbb{R}$, $x \mapsto 2x - 5$.

When the machines for $f$ and $g$ are linked together so that elements are
first processed by $f$ and then by $g$, we obtain the ‘composite’ machine
illustrated by the large box in Figure 47.


Figure 47 The composite function $g \circ f$ as a machine


For instance, when 2 is fed into the machine, it is first squared by $f$ to
produce the number 4, and then 4 is processed by $g$ to give the number
$(2 \times 4) - 5 = 3$.



Similarly, when an arbitrary real number $x$ is fed into the machine, it is
first processed by $f$ to give the real number $x^2$. Since $x^2$ lies in $\mathbb{R}$, the
domain of $g$, the number $x^2$ can then be processed by $g$ to give $2x^2 - 5$.
Thus, overall, the composite machine corresponds to a function, which we
denote by $g \circ f$, whose rule is


$(g \circ f)(x) = g(f(x)) = g(x^2) = 2x^2 - 5.$


In general, we have the following definition. 


Definition 
Let $f : A \to B$ and $g : B \to C$ be two functions such that the
domain of $g$ is the same set as the codomain, $B$, of $f$. Then the
composite function $g \circ f$ is given by
$g \circ f : A \to C$
$x \mapsto g(f(x)).$


Notice that go f means f first, then g. 
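The order of application can be made concrete with a short sketch. It assumes real functions are modelled as ordinary Python functions; `compose` is a hypothetical helper whose argument order mirrors the notation $g \circ f$. The functions are the squaring function $f$ and the function $g$ with rule $x \mapsto 2x - 5$ used in the machine example.

```python
def compose(g, f):
    """Return the composite g o f: apply f first, then g."""
    return lambda x: g(f(x))


def f(x):
    return x ** 2


def g(x):
    return 2 * x - 5


g_after_f = compose(g, f)   # x -> 2x^2 - 5
f_after_g = compose(f, g)   # x -> (2x - 5)^2

print(g_after_f(2))   # 2 is squared to give 4, then 4 is sent to (2 * 4) - 5
print(f_after_g(2))   # 2 is sent to -1, then -1 is squared
```

The two outputs differ, so for these particular functions the order of composition matters.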


Exercise A35 


Let $f$ and $g$ be the functions

$f : \mathbb{R} \to \mathbb{R}$, $x \mapsto -x$ and $g : \mathbb{R} \to \mathbb{R}$, $x \mapsto 3x + 1$.


Determine the composite functions


(a) $g \circ f$, (b) $f \circ g$.


In general, the composite functions $g \circ f$ and $f \circ g$ are not equal, as you
saw in Exercise A35.


Composite functions have many uses in mathematics; for example, we can 
use them to examine the effect of one transformation of the plane followed 
by another. 


Suppose, for instance, that $f$ and $g$ are the reflections of the plane in the
$x$-axis and $y$-axis respectively:

$f : \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (x, -y)$ and $g : \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (-x, y)$.

The composite function $g \circ f$ describes the overall effect of first reflecting
in the $x$-axis (changing the sign of $y$) and then reflecting in the $y$-axis
(changing the sign of $x$), as shown in Figure 48. The rule of $g \circ f$ is


$(g \circ f)(x, y) = g(f(x, y)) = g(x, -y) = (-x, -y).$


Figure 48 The composite $g \circ f$



Thus $g \circ f$ is the function

$g \circ f : \mathbb{R}^2 \to \mathbb{R}^2$
$(x, y) \mapsto (-x, -y),$

which rotates the plane through an angle $\pi$ about the origin, as can be
seen by considering Figure 49, which shows how a square is transformed by
$g \circ f$.


Figure 49 The composite function $g \circ f$ transforming a square


Exercise A36 


Determine the composite function $f \circ g$, where $f$ and $g$ are the reflections
of the plane in the $x$-axis and $y$-axis respectively, as defined above.


So far, we have considered the composite function g o f only when the 
domain of the function g is the same as the codomain of the function f. 
We can, however, form the composite function go f when g and f are any 
two functions. 


For example, consider the functions

$f : \mathbb{R} \to \mathbb{R}$, $x \mapsto x^2$ and $g : \mathbb{R} - \{1\} \to \mathbb{R}$, $x \mapsto \dfrac{1}{x - 1}$.

Recall that $\mathbb{R} - \{1\}$ is the set of all real numbers with 1 excluded.


Here the domain of $g$ is not equal to the codomain of $f$, but we can still
consider the composite function $g \circ f$, with the rule


$(g \circ f)(x) = g(f(x)) = g(x^2) = \dfrac{1}{x^2 - 1}.$



However, we have to be careful about the domain of $g \circ f$. It cannot be the
whole of $\mathbb{R}$, the domain of $f$. To see this, consider what happens when we
try to feed the number 1 into the ‘machine’ corresponding to $g \circ f$, as
shown in Figure 50.


Figure 50 An input number that cannot be ‘processed’ by $g \circ f$


If we try to feed the number 1 into the machine, then it can be processed
by $f$ to produce the number 1, but 1 cannot then be processed by $g$, since
it is not in the domain of $g$. We have the same problem if we try to feed
the number $-1$ into the machine. However, if we feed any other number in
the domain of $f$ into the machine, then it can be processed by $f$ and then
$g$ to produce a final output number. So we take the domain of $g \circ f$ to be
$\mathbb{R} - \{1, -1\}$. Thus the composite function $g \circ f$ is

$g \circ f : \mathbb{R} - \{1, -1\} \to \mathbb{R}$
$x \mapsto \dfrac{1}{x^2 - 1}.$

In general, if $f$ and $g$ are any two functions, then we take the domain of
the composite function $g \circ f$ to consist of all the elements $x$ in the domain of
$f$ such that $f(x)$ is in the domain of $g$. The codomain of $g \circ f$ is always the
same as the codomain of $g$. So we have the following definition.


Definition 
Let $f : A \to B$ and $g : C \to D$ be any two functions; then the
composite function $g \circ f$ has:


• domain $\{x \in A : f(x) \in C\}$
• codomain $D$
• rule $(g \circ f)(x) = g(f(x))$.


This definition allows us to consider the composite of any two functions,
though in some cases the domain may turn out to be the empty set $\varnothing$.
However, some texts insist on $f(A) \subseteq C$ as a condition to ensure $g \circ f$
exists.
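The domain in this general definition can be computed directly when the functions are finite. The sketch below is a finite analogue of the example above, with small sets of integers standing in for $\mathbb{R}$ and $\mathbb{R} - \{1\}$; the dictionary representation and the name `compose_rules` are assumptions made for illustration.

```python
from fractions import Fraction


def compose_rules(g, f):
    """Composite g o f of finite functions: keep only x in dom(f) with f(x) in dom(g)."""
    return {x: g[f[x]] for x in f if f[x] in g}


# f: x -> x^2 on {-2, ..., 2}; g: x -> 1/(x - 1) on {-4, ..., 4} with 1 excluded.
f = {x: x * x for x in range(-2, 3)}
g = {x: Fraction(1, x - 1) for x in range(-4, 5) if x != 1}

h = compose_rules(g, f)
print(sorted(h))   # 1 and -1 are excluded, since f sends both to 1
print(h)

# If no image of f lies in the domain of g, the composite has empty domain.
print(compose_rules({10: 0}, f))
```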




In the example above with

$g \circ f : \mathbb{R} - \{1, -1\} \to \mathbb{R}$
$x \mapsto \dfrac{1}{x^2 - 1},$

the domain of $g \circ f$ is just the set of values for which the rule of $g \circ f$ is
defined. This is not always the case, as illustrated in the following worked
exercise where the domain of $f$ is not the whole of $\mathbb{R}$.


Worked Exercise A15 


Determine the composite function $g \circ f$ for the following functions $f$ and $g$:

$f : [0, 2\pi) \to [-1, 1]$, $x \mapsto \sin x$ and $g : \mathbb{R} - \{-1\} \to \mathbb{R}^*$, $x \mapsto \dfrac{1}{x + 1}$.


Solution 
(The composite function $g \circ f$ means $f$ then $g$.)

The rule of $g \circ f$ is

$(g \circ f)(x) = g(f(x)) = g(\sin x) = \dfrac{1}{\sin x + 1}.$


(A number $x$ is in the domain of $g \circ f$ if it is in the domain of $f$ and
$f(x)$ is in the domain of $g$.)


The domain of $g \circ f$ is

$\{x \in [0, 2\pi) : f(x) \in \mathbb{R} - \{-1\}\}.$

If $x \in [0, 2\pi)$, then $f(x) \in \mathbb{R} - \{-1\}$ unless $f(x) = -1$.


Now $f(x) = -1$ means $\sin x = -1$, and the only value of $x$ in $[0, 2\pi)$
such that $\sin x = -1$ is

$x = \dfrac{3\pi}{2}.$


(The domain is complicated to write down so it helps to give it a
name, say $D$.)


So the domain of $g \circ f$ is

$D = [0, 2\pi) - \{3\pi/2\}.$

Thus $g \circ f$ is the function

$g \circ f : D \to \mathbb{R}^*$
$x \mapsto \dfrac{1}{\sin x + 1}.$



Notice that, as claimed, in the worked exercise above the domain of $g \circ f$ is
not the full set of values for which the rule of $g \circ f$ is defined. The full set of
values for which the rule is defined is


$\{x \in \mathbb{R} : \sin x \neq -1\} = \mathbb{R} - \{(2n - \tfrac{1}{2})\pi : n \in \mathbb{Z}\}.$


Exercise A37 
Determine the composite function go f for the following functions f and g: 


f:[-11J—-R ee a 
rte 3x + 1 


eS ‘ 
xr+2 


Using function composition to show that a function is 
the inverse of another function 


Suppose that $f : A \to B$ is a one-to-one and onto function. Then $f$ has
an inverse function $f^{-1} : B \to A$. We can therefore consider the effect
that the composite function $f^{-1} \circ f : A \to A$ has on an arbitrary element
$x$ in $A$. First, $f$ maps $x$ to an element $y = f(x)$ in $B$. Then $f^{-1}$ ‘undoes’
the effect of $f$ and maps $y$ back to $x$, as illustrated in Figure 51. Overall,
the effect of $f^{-1} \circ f$ is to leave $x$ unchanged, or fixed: that is,
$(f^{-1} \circ f)(x) = x$. Since $x$ is an arbitrary element of $A$, it follows that
$f^{-1} \circ f$ fixes all the elements of $A$. In other words, $f^{-1} \circ f = i_A$, the
identity function on the set $A$.


Figure 51 The composite function $f^{-1} \circ f$


A similar argument can be used to show that $f \circ f^{-1} = i_B$. So, if
$f : A \to B$ has an inverse function $f^{-1} : B \to A$, then


$f^{-1} \circ f = i_A$ and $f \circ f^{-1} = i_B$.


The converse of this statement is also true: that is, if a function
$g : B \to A$ satisfies


$g \circ f = i_A$ and $f \circ g = i_B$,


then $g$ is the inverse function of $f$. A proof of this is given after
Exercise A39. It leads to the following strategy.



Strategy A2 


To show that the function $g : B \to A$ is the inverse function of the
function $f : A \to B$:


1. show that $g(f(x)) = x$ for each $x \in A$; that is, $g \circ f = i_A$
2. show that $f(g(y)) = y$ for each $y \in B$; that is, $f \circ g = i_B$.


In practice, we can sometimes use Strategy A2 as an alternative way of 
finding an inverse function. We make an inspired guess at the inverse 
function, and use Strategy A2 to check that our guess is correct. 
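For finite sets, both steps of Strategy A2 can be checked exhaustively. The sketch below assumes functions are stored as dictionaries, so that the keys of `f` play the role of $A$ and the keys of `g` play the role of $B$; `is_inverse_pair` and the sample functions are hypothetical. The second example shows why the first step alone is not enough.

```python
def is_inverse_pair(f, g):
    """Check both steps of the strategy for finite functions f: A -> B and g: B -> A."""
    step1 = all(g.get(f[x]) == x for x in f)   # g(f(x)) = x for each x in A
    step2 = all(f.get(g[y]) == y for y in g)   # f(g(y)) = y for each y in B
    return step1 and step2


# A one-to-one and onto function from {0, 1, 2, 3} to {2, 3, 4, 5}, and a guess at its inverse.
f = {0: 3, 1: 5, 2: 2, 3: 4}
g = {3: 0, 5: 1, 2: 2, 4: 3}
print(is_inverse_pair(f, g))

# Here g2(f2(x)) = x for each x, but f2(g2(2)) = 0, so step 2 fails (f2 is not onto).
f2 = {0: 0, 1: 1}
g2 = {0: 0, 1: 1, 2: 0}
print(is_inverse_pair(f2, g2))
```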


Worked Exercise A16 


Use Strategy A2 to find the inverse of the function 
f:R—R 


Exercise A38 
Use Strategy A2 to show that g is the inverse of f, where 


$f : \mathbb{R} \to \mathbb{R}$, $x \mapsto 5x - 3$ and $g : \mathbb{R} \to \mathbb{R}$, $x \mapsto \dfrac{x + 3}{5}$.




Exercise A39 


Use Strategy A2 to find the inverse of the function 
f:R—-R 


To end this section, here is the promised proof that if the functions
$f : A \to B$ and $g : B \to A$ satisfy

$g \circ f = i_A$ and $f \circ g = i_B$,

then $g$ is the inverse function of $f$. That is, we prove that if the two steps
of Strategy A2 hold, then $f$ has an inverse function, and the inverse
function is equal to $g$.


Suppose, then, that the two steps of Strategy A2 hold. First we show that 
f is one-to-one. 


Suppose that $f(x_1) = f(x_2)$; then


$g(f(x_1)) = g(f(x_2)),$


so, since $g(f(x)) = x$ for each $x \in A$ by the first step of Strategy A2, we
have $x_1 = x_2$. Thus $f$ is one-to-one and so it has an inverse function $f^{-1}$.


Now we find the image set of f. 


We know that the image set of $f$ is a subset of its codomain $B$, so
$f(A) \subseteq B$. We now show that $f(A) \supseteq B$ by showing that every element $y$
of $B$ is the image under $f$ of some element in $A$. Suppose that $y \in B$.
Then, by the second step of Strategy A2,


$f(g(y)) = y;$


that is, $y$ is the image under $f$ of the element $g(y)$ and $g(y) \in A$, as
required. Therefore $f(A) \supseteq B$.


Since $f(A) \subseteq B$ and $f(A) \supseteq B$, it follows that the image set of $f$ is $B$ (that
is, $f$ is onto), and so $f^{-1}$ has domain $B$.


We now know that each of the functions $f^{-1}$ and $g$ has domain $B$ and
codomain $A$. To show that they are equal, it remains to show that
$g(y) = f^{-1}(y)$ for each element $y$ of $B$.


Let $y$ be an arbitrary element of $B$. Then $y = f(x)$ for some element $x$ of
$A$. So


$f^{-1}(y) = x,$

and, by the first step of Strategy A2,

$g(y) = g(f(x)) = x.$


Hence $f^{-1}$ and $g$ are indeed equal functions.


Figure 52 The same vector represented in different ways


4 Vectors 


In this section you will revise vectors, in both the plane $\mathbb{R}^2$ and in
three-dimensional space $\mathbb{R}^3$. Vectors are used throughout Book C Linear
algebra.


4.1 What is a vector? 


A mathematical or physical quantity that has a direction as well as a size 
is called a vector, or a vector quantity. An example of such a quantity 
is velocity: to state the velocity of a car you have to give its speed and also 
the direction in which it is moving. In contrast, some mathematical and 
physical quantities, such as temperature and volume, have only a size — 
they have no direction associated with them. We call such quantities 
scalars, or scalar quantities. When discussing vectors and scalars, we 
usually use the term magnitude, rather than size. 


Definition 


A vector is a quantity that is determined by its magnitude and 
direction. A scalar is a quantity that is determined by its magnitude. 


We can represent a vector in $\mathbb{R}^2$ or in $\mathbb{R}^3$ geometrically by a line segment
with an arrowhead, as illustrated in Figure 52. The length of the line
segment is a measure of the magnitude of the vector, and the direction of
the arrowhead indicates the direction. The starting point of the line
segment does not matter; for example, all the line segments with
arrowheads in Figure 52 represent the same vector. We can draw the
arrowhead at the end of the line segment, or in the middle of it, as
convenient. A vector represented by a line segment from $A$ to $B$, with an
arrowhead pointing from $A$ to $B$, can be written as $\overrightarrow{AB}$.


Often we use single letters, such as $\mathbf{a}$, $\mathbf{b}$, $\mathbf{p}$, $\mathbf{q}$ or $\mathbf{v}$, to denote vectors.
Vectors are usually distinguished in print by the use of a bold typeface,
and in handwritten work by underlining the letters (for example, $\underline{v}$). These
are important conventions as they clearly distinguish vector quantities
from scalar quantities.


We denote the magnitude of a vector $\mathbf{v}$ by the notation $|\mathbf{v}|$.


There is one vector that does not fit conveniently into the definition above; 
namely, the zero vector. It represents any vector quantity that has 
magnitude zero and hence has no direction, such as the velocity of a 
stationary car. 


Definition 
The zero vector is the vector whose magnitude is zero, and whose
direction is undefined. It is denoted by the symbol $\mathbf{0}$.


The next box defines what it means to say that two vectors are equal. 


Definition 

Two vectors $\mathbf{a}$ and $\mathbf{b}$ are equal if:

• they have the same magnitude; that is, $|\mathbf{a}| = |\mathbf{b}|$
• they are in the same direction.


We write $\mathbf{a} = \mathbf{b}$.


For example, in Figure 53, the vector v is equal to the vector d, but is not 
equal to any of the other vectors, as they all differ from v in magnitude or 
direction. 




Figure 53 A selection of vectors in the plane 


We now briefly revise some other definitions relating to vectors. 


Definition 
The negative of a vector v is the vector that has the same magnitude 
as v, but the opposite direction. It is denoted by —v. 


For example, in Figure 53 we have bee If we write v as AB for 
suitable points A and B, then —v = BA, as shown in Figure 54. 


Scalar multiple of a vector 


Let $k$ be a scalar and $\mathbf{v}$ a vector. The scalar multiple $k\mathbf{v}$ of $\mathbf{v}$ is the
vector:


• whose magnitude is $|k|$ times the magnitude of $\mathbf{v}$; that is,
$|k\mathbf{v}| = |k|\,|\mathbf{v}|$

• that has the same direction as $\mathbf{v}$ if $k > 0$, and the opposite direction
if $k < 0$.


If $k = 0$, then $k\mathbf{v} = \mathbf{0}$.


Figure 54 The vectors $\mathbf{v}$ and $-\mathbf{v}$



For example, in Figure 53 we have $\mathbf{c} = 2\mathbf{v}$, since $\mathbf{c}$ has the same direction
as $\mathbf{v}$ but twice the magnitude, and $\mathbf{e} = -3\mathbf{f}$, since $\mathbf{e}$ has the opposite
direction to $\mathbf{f}$ and its magnitude is 3 times that of $\mathbf{f}$.


Exercise A40 


For each of the vectors shown below, decide whether it is a multiple of any 
of the other vectors; if it is, write down an equation of the form v1 = kv2
that specifies the relationship between them. 




Exercise A41 


For the vector d in Exercise A40, sketch 3d and —2d. 


We can add two vectors using either of the two laws below. They give the 
same result, as illustrated in Figure 55. 


Triangle Law for addition of vectors 

The sum p + q of two vectors p and q is obtained as follows. 

1. Starting at any point, draw the vector p. 

2. Starting from the tip of the vector p, draw the vector q. 

Then the sum p + q is the vector from the tail of p to the tip of q. 


Parallelogram Law for addition of vectors 
The sum p + q of two vectors p and q is obtained as follows. 
1. Starting at the same point, draw the vectors p and q. 


2. Complete the parallelogram of which these vectors are adjacent 
sides. 


Then the sum p + q is the vector from the point where the tails of p 
and q meet to the opposite corner of the parallelogram. 


Figure 55 The sum p + q obtained by (a) the Triangle Law (b) the Parallelogram Law


Addition and scalar multiplication of vectors obey the usual rules of 
algebra. The most important of these are listed in the box below. 


Properties of vector algebra 


Let p, q and r be vectors, and let a, b ∈ R. The following properties
hold.


Commutativity p + q = q + p
Associativity (p + q) + r = p + (q + r)


Distributivity a(p + q) = ap + aq,
(a + b)p = ap + bp.


Finally, we define subtraction of vectors in terms of addition and the 
negative of a vector, as follows, and as illustrated in Figure 56. 


Definition 
The difference p − q of the vectors p and q is


p − q = p + (−q).


Figure 56 The difference p − q of vectors p and q


Since the vector —q has the same magnitude as q but the opposite 
direction, we can draw p — q by using either of the two constructions that 
we use for adding vectors. 


In general, q − p does not equal p − q; in fact, as you would expect,


q − p = −(p − q).




Exercise A42 


For the vectors p and q shown below, sketch p + q, p − q and 2p + ½q.




4.2 Components and the arithmetic of 
vectors 


We can sometimes simplify the manipulation of vectors by expressing them 
in component form. To do this, we start by defining the following unit 
vectors, shown in Figure 57. A unit vector is a vector of magnitude 1. 


In R², the vectors i and j are the unit vectors in the positive
directions of the x- and y-axes, respectively.


In R³, the vectors i, j and k are the unit vectors in the positive
directions of the x-, y- and z-axes, respectively.




Figure 57 The unit vectors i, j and k 


Any vector in R² can be expressed as the sum of scalar multiples of i and
j, and similarly any vector in R³ can be expressed as the sum of scalar
multiples of i, j and k. For example, the vector v in Figure 58(a) can be
expressed as


v = 3i + 4j,
and the vector w in Figure 58(b) can be expressed as
w = 2i + 4j + 3k.


These expressions are the component forms of v and w. 


Figure 58 (a) A vector v in R² (b) A vector w in R³


In general we have the following. 


Definitions 
Any vector p in R² can be expressed in component form as
p = a1i + a2j, for some real numbers a1, a2;


we often write p = (a1, a2), for brevity. The numbers a1 and a2 are
the components of p in the x- and y-directions, respectively.


Any vector p in R³ can be expressed in component form as
p = a1i + a2j + a3k, for some real numbers a1, a2, a3;


we often write p = (a1, a2, a3), for brevity. The numbers a1, a2 and a3
are the components of p in the x-, y- and z-directions, respectively.


So, for example, the component form of the vector v in Figure 58(a) is
3i + 4j, or, equivalently, (3, 4).

Similarly, the component form of the vector w in Figure 58(b) is
2i + 4j + 3k, or, equivalently, (2, 4, 3).


In some texts, the ordered pairs and ordered triples that represent the
component forms of vectors are written vertically, as column vectors, to
distinguish them from points. Although we write them horizontally in this
module, the meaning of an ordered pair or ordered triple should be clear
from the context.


Exercise A43 


Sketch the following vectors in R² on a single diagram:


2i − 3j, −3i + 4j, −2i − 2j.




In the box below, the operations on vectors that were described 
geometrically in Subsection 4.1 are expressed in terms of components. The 
component forms of the vectors are expressed as ordered pairs and ordered 
triples in the box; there are analogous formulas for vectors expressed in 
terms of the unit vectors i, j and k. For example, the zero vector in R² can
be written as 0 = 0i + 0j rather than as 0 = (0, 0).


Vector arithmetic in component form 
Equality Two vectors, both in R² or both in R³, are equal if their
corresponding components are equal.

Zero vector The zero vector is
0 = (0, 0) in R²,
0 = (0, 0, 0) in R³.

Addition To add vectors in R² or in R³, add their corresponding
components:
(a1, a2) + (b1, b2) = (a1 + b1, a2 + b2),
(a1, a2, a3) + (b1, b2, b3) = (a1 + b1, a2 + b2, a3 + b3).

Negatives To find the negative of a vector in R² or in R³, take the
negatives of its components:
−(a1, a2) = (−a1, −a2),
−(a1, a2, a3) = (−a1, −a2, −a3).

Subtraction To subtract vectors in R² or in R³, subtract the
corresponding components:
(a1, a2) − (b1, b2) = (a1 − b1, a2 − b2),
(a1, a2, a3) − (b1, b2, b3) = (a1 − b1, a2 − b2, a3 − b3).

Scalar multiplication To multiply a vector in R² or in R³ by a
real number k, multiply each component by k:
k(a1, a2) = (ka1, ka2),
k(a1, a2, a3) = (ka1, ka2, ka3).

Magnitude The magnitude of the vector (a1, a2) in R² is
√(a1² + a2²).
The magnitude of the vector (a1, a2, a3) in R³ is
√(a1² + a2² + a3²).


The formulas for magnitude are derived from the distance formulas for R²
and R³ that you met in Section 1.




Here are some examples of vector arithmetic in component form, in R²:
the sum of two vectors,


(1, −3) + (4, 2) = (1 + 4, −3 + 2) = (5, −1),
the negative of a vector,

−(1, −3) = (−1, 3),
and a scalar multiple of a vector,

2(2, −1) = (4, −2).
The magnitude of the vector (1, −3) is given by


√(1² + (−3)²) = √(1 + 9) = √10.
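The component rules in the box can be tried out numerically. The following is a minimal Python sketch, not part of the module text: vectors are represented as tuples, and the function names (add, negative, subtract, scale, magnitude) are illustrative choices. It assumes that both vectors in an operation have the same number of components.

```python
import math

def add(p, q):
    """Add two vectors by adding corresponding components."""
    return tuple(a + b for a, b in zip(p, q))

def negative(p):
    """Negate a vector by negating each component."""
    return tuple(-a for a in p)

def subtract(p, q):
    """Subtract q from p component by component."""
    return tuple(a - b for a, b in zip(p, q))

def scale(k, p):
    """Multiply each component of p by the scalar k."""
    return tuple(k * a for a in p)

def magnitude(p):
    """Square root of the sum of the squared components."""
    return math.sqrt(sum(a * a for a in p))

# The examples given in the text above.
print(add((1, -3), (4, 2)))    # (5, -1)
print(negative((1, -3)))       # (-1, 3)
print(scale(2, (2, -1)))       # (4, -2)
print(magnitude((1, -3)))      # the square root of 10, as a decimal
```

Because the functions work component by component, the same code applies to vectors in R² and in R³.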


Exercise A44 


For each of the following pairs of vectors p and q, write down p + q, −q
and p − q.


(a) p = (3, −1) and q = (−1, −2).
(b) p = −i − 2j and q = 2i − j.
(c) p = −i + 2k and q = i − 2j − k.


Exercise A45 


For each of the following pairs of vectors p and q, determine 2p, 3q and
2p − 3q, and find the magnitude of q.


(a) p = (3, −1) and q = (−1, −2).
(b) p = −i + 2k and q = i − 2j − k.


Unit vectors 


As you saw earlier, a unit vector is a vector of magnitude 1. We denote
the unit vector that is in the same direction as a particular vector v by v̂
(read as ‘v hat’), as illustrated in Figure 59.
To find v̂, we multiply v by the reciprocal of its magnitude, as follows.


Figure 59 A vector v and its corresponding unit vector v̂


The unit vector in the same direction as a vector v is

v̂ = (1/|v|) v.


The exception to this notation for unit vectors is that we use the special
symbols i, j and k for the unit vectors in the positive directions of the x-,
y- and z-axes, as you saw earlier. This is common practice, though some
texts use the alternative symbols x̂, ŷ and ẑ for these vectors.
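The rule for finding a unit vector can be sketched in a few lines of Python. This is an illustrative sketch only: the function name unit_vector is an assumption, and the check for the zero vector is added here because the zero vector has no direction, so no unit vector corresponds to it.

```python
import math

def unit_vector(v):
    """Return the unit vector in the same direction as v.

    Multiplies v by the reciprocal of its magnitude. The zero vector
    is rejected, since its direction is undefined.
    """
    length = math.sqrt(sum(a * a for a in v))
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(a / length for a in v)

v_hat = unit_vector((3, 4))   # |v| = 5, so each component is divided by 5
print(v_hat)                  # (0.6, 0.8)
```

A quick check of any result is that its magnitude should be 1, up to rounding.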


Figure 60 The position vector of the point (2, −1)


Worked Exercise A17 


Find v̂ for v = (3, 4).


Exercise A46


Find v̂ for each of the following vectors v.
(a) (2, −3) (b) 5i + 12j


Position vectors 


There is a natural and useful way to associate every point in the plane or 
in three-dimensional space with a vector. We make the following definition. 


Definition 

Let P be any point in R² or R³. The position vector of P is the
vector whose starting point is the origin and whose finishing point
is P, that is, the vector OP, where O is the origin.


For example, the position vector of the point P(2, −1) is the vector
OP = 2i − j (often written as (2, −1)), as shown in Figure 60.


In general, any point (x, y) in R² has position vector xi + yj (often written
as (x, y)), and similarly any point (x, y, z) in R³ has position vector
xi + yj + zk (often written as (x, y, z)).


Exercise A47 


Let p and q be the position vectors of the points (5,3) and (1,4), 
respectively. 


(a) Determine the vectors p — q, p + q and $p + 54. 


(b) Sketch p, q and each of the vectors that you found in part (a), 
starting each vector at the origin. 


The following simple result about position vectors is often useful. 


Let A and B be points (in R² or R³), with position vectors a and b,
respectively. Then


AB = b − a.


To see this, let O be the origin, as shown in Figure 61. Then
AB = AO + OB (by the Triangle Law for vector addition)
= −OA + OB
= −a + b
= b − a,


as claimed. 
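As a small numerical illustration of this result, the Python sketch below computes the vector AB from two position vectors. The function name displacement and the sample points are hypothetical choices made for this example.

```python
def displacement(a, b):
    """Return the vector AB = b - a, given the position vectors a and b."""
    return tuple(bi - ai for ai, bi in zip(a, b))

a = (5, 3)   # position vector of a sample point A
b = (1, 4)   # position vector of a sample point B

ab = displacement(a, b)
print(ab)    # (-4, 1)

# Starting at A and following AB should arrive at B, that is, a + AB = b.
arrived = tuple(ai + di for ai, di in zip(a, ab))
print(arrived == b)   # True
```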


The sets R² and R³


Finally, we clarify some issues about the sets R² and R³. You have seen
that we use the notation R² to denote the plane, and the notation R³ to
denote three-dimensional space. Strictly, the meaning of these notations is
as follows:

R² = {(x, y) : x, y ∈ R},

R³ = {(x, y, z) : x, y, z ∈ R}.
That is, R² is the set of all ordered pairs of real numbers, and R³ is the set
of all ordered triples of real numbers. We interpret these sets as the plane
and as three-dimensional space, respectively, by interpreting their elements
as the coordinates of points with respect to particular coordinate systems,
in the way that you have seen.


However, it is often useful to instead interpret the elements of R² and R³
as vectors. For example, we can interpret the element (2, −1) of R² not as
the point with coordinates (2, −1), but instead as the vector with
component form (2, −1).


We can use whichever interpretation of R² and R³ is more useful in a
particular context. A link between the two interpretations is provided by
position vectors, because the vector with component form (x, y) is the
position vector of the point with coordinates (x, y), and similarly the
vector with component form (x, y, z) is the position vector of the point
with coordinates (x, y, z).


This link also makes it straightforward to represent a particular point not 
by coordinates, but by a vector: we use its position vector. It might seem 
that this amounts to much the same thing, but the advantage of 
representing points by vectors is that it enables us to use the properties of 
vectors to work with points. This leads to some very convenient ways of 
working with points, as you will see in the next subsection and again in 
Subsection 4.5. In Book C you will see how generalising all these ideas 
leads to some interesting and very useful mathematics. 


Figure 61 Points A and B and their position vectors


Figure 62 A point R on the line l


Figure 63 The position of R determined by λ


4.3 Vector form of the equation of a line 


In Subsection 1.1, we found that every line in the plane has an equation of 
the form 


ax + by =c, 


where a, b and c are real numbers, with a and b not both zero. In this 
subsection we find an equivalent general form for the equation of a line in 
terms of vectors. Unlike the equation above, this vector form applies to 
lines in R³ as well as in R², as you will see later in this subsection.


Let P and Q be points with position vectors p and q, respectively, and
let l be the line that passes through P and Q, as illustrated in Figure 62.
We now find an expression for the position vector r of an arbitrary point R
on l in terms of the position vectors p and q.


Since the vector PR is parallel to the vector PQ, it must be a multiple of
PQ, that is,


PR = λPQ,


for some real number λ. Now, by the result about position vectors given at
the end of the last subsection, we have


PR = r − p and PQ = q − p.
So
r − p = λ(q − p).
We can rearrange this equation as
r = p + λ(q − p),
that is,
r = (1 − λ)p + λq. (3)


This is a general formula for the position vector of a point on the line
through P and Q, in the following sense: each point on l corresponds to a
particular value of λ, and vice versa. As shown in Figure 63, we have the
following.


• If λ = 0, then r = 1p + 0q = p.
• If λ = 1, then r = 0p + 1q = q.
• If λ > 1, then R lies on l beyond Q.
• If 0 < λ < 1, then R lies on l between P and Q.
• If λ < 0, then R lies on l beyond P.


So we can regard equation (3) as the vector form of the equation of the 
line l. 


Vector form of the equation of a line 


The equation of the line through the points with position vectors p
and q is
r = (1 − λ)p + λq, where λ ∈ R.


Note in particular that when λ = ½ in the equation above, we have
r = ½p + ½q = ½(p + q), which is the position vector of the midpoint of
the line segment PQ.
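The vector form of the equation of a line translates directly into a short computation. The Python sketch below is an illustration under stated assumptions: the names point_on_line and parameter_of are hypothetical, the points are assumed to have integer or rational coordinates (so that exact fractions can be used), and p and q are assumed to be distinct.

```python
from fractions import Fraction

def point_on_line(p, q, lam):
    """Return r = (1 - lam)p + lam q, the point on the line with parameter lam."""
    return tuple((1 - lam) * pi + lam * qi for pi, qi in zip(p, q))

def parameter_of(p, q, r):
    """Return lam such that r = (1 - lam)p + lam q, or None if r is not on the line."""
    lam = None
    for pi, qi, ri in zip(p, q, r):
        if qi != pi:
            # Solve one coordinate equation for lam using exact arithmetic.
            lam = Fraction(ri - pi, qi - pi)
            break
    if lam is None:
        raise ValueError("p and q must be distinct points")
    # The value found must also satisfy every other coordinate equation.
    return lam if point_on_line(p, q, lam) == tuple(r) else None

p, q = (3, 1), (2, 3)
print(point_on_line(p, q, Fraction(1, 2)))  # midpoint of PQ
print(parameter_of(p, q, (4, -1)))          # -1
print(parameter_of(p, q, (4, 5)))           # None
```

Solving one coordinate equation for λ and then checking the remaining coordinates mirrors the usual hand method for deciding whether a given point lies on a line. Nothing in the code depends on the dimension, so it applies equally to points in R³.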


Worked Exercise A18 


(a) Let P and Q be the points with position vectors p = (1,3) and 
q = (—1, —2), respectively. Find the vector form of the equation of 
the line l through P and Q. 


(b) Determine whether the point (3, 8) lies on l.


Exercise A48 


Let P and Q be the points with position vectors p = (3,1) and q = (2,3), 
respectively. Let l be the line through P and Q. 


(a) Find the vector form of the equation of the line l. 


(b) Determine the three points on l whose position vectors are given by
the equation you found in part (a) when λ takes the values 4, 2 and
−5, respectively.

(c) On a single diagram, sketch P, Q, the line l through P and Q, and
the three points that you found in part (b).




Exercise A49 


Let P, Q and l be as in Exercise A48. 


(a) Determine the value of λ corresponding to the point (4, −1) in the
vector form of the equation of l.


(b) Use the vector form of the equation of l to prove that the point (4, 5)
does not lie on l.


In the vector form of the equation of a line, there is no assumption that p
and q are position vectors of points in R²: they may equally well be
position vectors in R³.


Worked Exercise A19 


(a) Let P and Q be the points with position vectors p = (1, 2, 3) and
q = (3, −2, 1), respectively. Find the vector form of the equation of
the line l through P and Q.


(b) Determine whether the point (4, −4, 0) lies on the line l.


Exercise A50 


(a) Let P and Q be the points with position vectors p = (2, 1, 0) and
q = (1, 0, −1), respectively. Find the vector form of the equation of
the line l through P and Q.


(b) Determine the points on l whose position vectors are given by the
equation you found in part (a) when λ takes the values 5 and −1.




4.4 Scalar product 


In this subsection you will meet a way of combining two vectors, known as 
the scalar product or dot product, which is useful in linear algebra, as you 
will see in Book C. 


The definition of the scalar product is given below. It applies to vectors in
both R² and R³.


Definition
If u and v are non-zero vectors in R² or R³, then the scalar product
(or dot product) of u and v is


u · v = |u||v| cos θ,
where θ is the angle between u and v.


If one or both of u and v is the zero vector, then u · v = 0.


The scalar product of two vectors is a scalar, hence the name. 


Note that the angle between two vectors is defined to be the angle θ in the
range 0 ≤ θ ≤ π between their directions when the vectors are placed to
have the same starting point (not necessarily the origin), as illustrated in
Figure 64 for vectors in R² and R³.


Figure 64 The angle θ between two vectors u and v in (a) R² and (b) R³


Let us use the definition of the scalar product to calculate the scalar
product u · v of the vectors u = (2, 0) and v = (3, 3) in R², which are
shown in Figure 65. We have


|u| = 2
and
|v| = √(3² + 3²) = √(2 × 3²) = 3√2.


Figure 65 The vectors u = (2, 0) and v = (3, 3)


The angle θ between the vectors u and v is π/4. Hence


u · v = |u||v| cos θ
= 2 × 3√2 × cos(π/4)
= 2 × 3√2 × (1/√2)
= 6.


Figure 66 A vector v


Figure 67 Perpendicular vectors u and v


There is an easier way to calculate the scalar product of two vectors, which 
does not depend on knowing the angle between them, but just involves 
their components. You will meet this method shortly, but first we will use 
the definition of the scalar product to derive some of its properties. 


To start with, consider any vector v in R² or R³, as illustrated in
Figure 66. Let us find the scalar product of v with itself. The angle
between v and itself is 0, so we have


v · v = |v||v| cos 0 = |v|² × 1 = |v|².


This gives the following property.


Magnitude of a vector in terms of scalar product


For any vector v in R² or R³,


|v| = √(v · v).


Now consider any two vectors u and v in R² or R³ that are at right angles
to each other, as illustrated in Figure 67. Their scalar product is


u · v = |u||v| cos(π/2) = |u||v| × 0 = 0.
So the scalar product of any two perpendicular vectors is 0.


A converse of this result also holds. Suppose that u and v are vectors in
R² or R³ whose scalar product is 0. Then, by the definition of the scalar
product,


|u||v| cos θ = 0,
where θ is the angle between u and v. It follows that
|u| = 0 or |v| = 0 or cos θ = 0,
and hence
u = 0 or v = 0 or θ = π/2.
So we have the following property.


Scalar product and perpendicularity
Let u and v be vectors.
• If u and v are perpendicular, then u · v = 0.
• If u · v = 0, then u = 0, or v = 0, or u and v are perpendicular.




Finally, the scalar product has the following algebraic properties. 


Algebraic properties of scalar product 


Let u, v and w be vectors in R² or R³, and let a ∈ R. The following
properties hold.


Commutativity u · v = v · u
Multiples (au) · v = u · (av) = a(u · v)


Distributivity u · (v + w) = u · v + u · w,
(u + v) · w = u · w + v · w.


Note that the distributive properties in the box also hold if the plus signs
are replaced by minus signs. This follows by combining the distributive
properties with the multiples property for a = −1.


The properties of the scalar product in the box can be proved by using the 
definition of the scalar product. 


The commutative property follows immediately from the definition. 


To see why the multiples property holds, let u and v be two vectors in R²
or R³, and first suppose that a is a positive constant. If the angle between
u and v is θ, then the angle between au and v is also θ, as illustrated in
Figure 68, so


(au) · v = |au||v| cos θ
= |a||u||v| cos θ
= a|u||v| cos θ (since a is positive)
= a(u · v),
and, similarly,
u · (av) = |u||av| cos θ
= |u||a||v| cos θ
= a|u||v| cos θ
= a(u · v).


Figure 68 Vectors u and v, and a scalar multiple au of u, where a is positive


The multiples property can be proved in the case where a is negative in a
similar way. In this case the angle between au and v, and also the angle
between u and av, is π − θ, but cos(π − θ) = −cos θ by the properties of
the cosine function (see the module Handbook).


The proof of the distributive properties is more complicated, and the 
details are omitted here. 




Using the properties of the scalar product given above, we can prove the 
following simple formulas for calculating the scalar product. 


Scalar product of vectors in component form 


In R², let u = (x1, y1) and v = (x2, y2). Then


u · v = x1x2 + y1y2.


In R³, let u = (x1, y1, z1) and v = (x2, y2, z2). Then


u · v = x1x2 + y1y2 + z1z2.


Here is a proof of the formula above for vectors in R². The proof for
vectors in R³ is similar, but longer.


Let u and v be vectors in R². We write them in component form as
u = x1i + y1j and v = x2i + y2j,
as shown in Figure 69 below.


Figure 69 The vectors u and v in component form


This gives
u · v = (x1i + y1j) · (x2i + y2j)
= (x1i + y1j) · x2i + (x1i + y1j) · y2j (by distributivity)
= x1i · x2i + y1j · x2i + x1i · y2j + y1j · y2j
(by distributivity)
= x1x2 i · i + y1x2 j · i + x1y2 i · j + y1y2 j · j
(by the multiples property).


Now i and j have magnitude 1, so by the formula for the magnitude of a
vector in terms of scalar product, given earlier,


i · i = 1² = 1 and j · j = 1² = 1.

Also, i and j are perpendicular, so
i · j = j · i = 0.

Hence
u · v = x1x2 × 1 + y1x2 × 0 + x1y2 × 0 + y1y2 × 1
= x1x2 + y1y2,
as claimed.
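The component formula is easy to compare with the original definition numerically. The following Python sketch is an illustration only; the function name dot is an assumption. It recomputes the scalar product of u = (2, 0) and v = (3, 3) both ways, using the angle π/4 stated earlier in this subsection, and checks one pair of perpendicular vectors.

```python
import math

def dot(u, v):
    """Scalar product in component form: the sum of products of components."""
    return sum(a * b for a, b in zip(u, v))

u, v = (2, 0), (3, 3)

# Component formula: 2 x 3 + 0 x 3.
by_components = dot(u, v)

# Original definition |u||v| cos(theta), with theta = pi/4 and |w| = sqrt(w . w).
by_definition = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)) * math.cos(math.pi / 4)

print(by_components)                                # 6
print(math.isclose(by_components, by_definition))   # True

# Perpendicular vectors have scalar product 0 (up to rounding in floating point).
a = (math.sqrt(2), -4)
b = (2 * math.sqrt(2), 1)
print(math.isclose(dot(a, b), 0.0, abs_tol=1e-12))  # True
```

The comparisons use a tolerance because square roots and cosines are computed only approximately in floating-point arithmetic.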




Worked Exercise A20 


Calculate the following scalar products. 
(a) (3, 3) · (2, 0) (b) (2i + 3j) · (21-5) (c) (√2, −4) · (2√2, 1)
(d) (1; =; 1) ° (1, =l; 1) 


Worked Exercise A20(a) is the particular scalar product that was 
calculated using the original definition near the start of this subsection. 


Notice that the result of Worked Exercise A20(c) shows that the vectors
(√2, −4) and (2√2, 1) are perpendicular, something that is not
immediately obvious when we look at their component forms.


Exercise A51 


Calculate the following scalar products. 
(a) (2, 3) · (8, −4) (b) (1, 4) · (2, −4) (c) (28 + 5) - (BE - 23)
(d) (1, −1, −2) · (3, −2, −5)


One useful application of the scalar product is that it provides a method 
for finding the angle between two vectors, as illustrated in Figure 70. The 
formula below is obtained by rearranging the original definition of the 
scalar product. 


Angle between two vectors 


The angle θ between two vectors u and v is given by


cos θ = (u · v) / (|u||v|).
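The angle formula can be sketched in Python as follows. This is an illustration under stated assumptions: the function names dot and angle_between are hypothetical, both vectors are assumed to be non-zero, and the computed cosine is clamped to the interval from −1 to 1 as a guard against rounding error (a precaution added here, not part of the formula).

```python
import math

def dot(u, v):
    """Scalar product in component form."""
    return sum(a * b for a, b in zip(u, v))

def angle_between(u, v):
    """Return the angle in radians, between 0 and pi, between non-zero u and v."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("the angle is not defined for the zero vector")
    cos_theta = dot(u, v) / (norm_u * norm_v)
    # Guard against values such as 1.0000000000000002 caused by rounding.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

theta = angle_between((4, -2), (9, 3))
print(math.isclose(theta, math.pi / 4))   # True

phi = angle_between((3, 1, -1), (1, 2, 3))
print(round(phi, 2))                      # angle to two decimal places
```

For the first pair, u · v = 30 and |u||v| = √20 × √90 = 30√2, so cos θ = 1/√2, which agrees with the computed angle π/4.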


Figure 70 Two vectors u and v, and the angle θ between them


Figure 71 The vectors u = (4, −2) and v = (9, 3)


Figure 72 The vectors u = (3, 1, −1) and v = (1, 2, 3)


In the next worked exercise this formula is used to find the angle between 
two vectors in R².


Worked Exercise A21 


Find the angle θ between the vectors u = (4, −2) and v = (9, 3), in
radians. (These vectors are shown in Figure 71.) 


Exercise A52 


Find the angle between the vectors in each of the following pairs of vectors,
in radians. Give your answer to two decimal places unless it is an obvious
multiple of π.


(a) (1, 4), (5, 2) (b) (−2, 2), (1, −1) (c) 91 − 2j, i + 2j


The formula for the angle between two vectors works equally well in R³, as
is shown in the next worked exercise. 


Worked Exercise A22 


Find the angle θ between the vectors u = (3, 1, −1) and v = (1, 2, 3), in
radians to two decimal places. (These vectors are shown in Figure 72.) 


Exercise A53 


Find the angle between the following pairs of vectors, in radians to two 
decimal places. 


(a) (3, 4, 5), (1, 0, −1)
(b) 2j − 3k, −i − j − 2k


4.5 Equation of a plane in R³


In Subsection 1.1 you saw that the general form of the equation of a line in
R² is ax + by = c, where a, b, c ∈ R, and a and b are not both zero. We can
use the scalar product to derive a similar general form for the equation of a
plane in R³, as you will see in this subsection. In doing this, we will also
derive a general form for the equation of a plane in R³ in terms of vectors.


First, let us look at some planes in R³ whose equations are easy to find.
The ‘simplest’ planes in R³ are the three planes that contain a pair of axes.
The (x, y)-plane is the plane that contains the x- and y-axes, as
illustrated in Figure 73. The (x, z)-plane and the (y, z)-plane are defined
similarly. The points that lie in the (x, y)-plane are the points (x, y, z) in
R³ for which z = 0, so the equation of the (x, y)-plane is


z = 0.


Figure 73 The (x, y)-plane


Figure 75 Parallel planes


Exercise A54 


Write down the equations of the (y, z)-plane and the (x, z)-plane.


Exercise A55 


Sketch the planes whose equations are as follows. 
(a) z=2 (b) y=-1 


Before we derive the general equation of a plane in R³, we need the
following concept. 


Definition 
A vector that is perpendicular to all the vectors in a particular plane 


is called a normal vector (or simply a normal) to the plane. Its 
direction is said to be normal to the plane. 


Figure 74(a) shows some normal vectors to a plane. If n is a normal vector 
to a particular plane, then so is kn, for any non-zero real number k. If 

k > 0, then kn is in the same direction as n, whereas if k < 0, then kn is 
in the opposite direction to n, as illustrated in Figure 74(b). 


Figure 74 Some normal vectors to a plane


Any non-zero vector n in R³ is a normal vector to infinitely many planes,
all parallel to each other, as illustrated in Figure 75.


We can specify any particular plane in R³ by specifying a normal vector to
the plane, together with a point that lies in the plane. For example, there
is exactly one plane that contains the point P(2, 3, 4) and has
n = (1, 2, −1) as a normal.


Here is how we can find an equation for this particular plane. A condition
for an arbitrary point X(x, y, z) in R³ to lie in the plane is that the vector
PX must be perpendicular to the normal vector n, as illustrated in
Figure 76. In other words, we must have


PX · n = 0.


Now
PX = x − p (where x and p are the position vectors of X and P)
= (x, y, z) − (2, 3, 4)
= (x − 2, y − 3, z − 4).
Hence the condition for the point X(x, y, z) to lie in the plane is
(x − 2, y − 3, z − 4) · (1, 2, −1) = 0,
that is,
(x − 2) × 1 + (y − 3) × 2 + (z − 4) × (−1) = 0,
which simplifies to
x + 2y − z = 4.
This is the equation of the plane.
In fact every plane in R³ has an equation of the form
ax + by + cz = d,


for some real numbers a, b, c and d. To prove this, we apply the argument
above to a general plane. Consider the plane that contains the point
P(x1, y1, z1) and has n = (a, b, c) as a normal vector, as illustrated in
Figure 77.


Figure 76 An arbitrary point X(x, y, z) on the plane containing the point
P(2, 3, 4) with normal n = (1, 2, −1)


Figure 77 An arbitrary point X(x, y, z) on the plane containing the point
P(x1, y1, z1) with normal n = (a, b, c)




A condition for an arbitrary point X(x, y, z) in R³ to lie in this plane is
that the vectors PX and n must be perpendicular, that is,


PX · n = 0.


Since PX = x − p, where x and p are the position vectors of X and P,
respectively, this condition can be written as

(x − p) · n = 0.
By the algebraic properties of the scalar product, we can write the
condition as

x · n − p · n = 0,
that is,

x · n = p · n.
This is the vector form of the equation of the plane. Alternatively, we can
write it in terms of the coordinates x, y and z, by substituting for
x = (x, y, z), n = (a, b, c) and p = (x1, y1, z1). Then the equation becomes


(x, y, z) · (a, b, c) = (x1, y1, z1) · (a, b, c),


that is,
ax + by + cz = ax1 + by1 + cz1.


This equation is of the form


ax + by + cz = d,


where d is the real number given by d = ax1 + by1 + cz1. So we have shown
that every plane in R³ has an equation of this form, for some real numbers
a, b, c and d.


Equation of a plane in R³
The equation of the plane that contains the point (x1, y1, z1) and has
the vector n = (a, b, c) as a normal is
ax + by + cz = d,
where d = ax1 + by1 + cz1.


This equation can be written in vector form as
x · n = p · n,


where x = (x, y, z) and p = (x1, y1, z1).


Once we know the equation of a plane in the form ax + by + cz = d, we can
‘read off’ the components of a normal vector, as they are the coefficients of 
x, y and z in the equation. For instance, one normal to the plane with 
equation x — 2y + 3z = 7 is n = (1,—2,3). Note that the zero vector can 
never be a normal since its direction is undefined. 
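The boxed result can be turned into a short computation. The Python sketch below is illustrative only: the names dot, plane_through and lies_in_plane are assumptions, a plane is represented by the tuple (a, b, c, d) of the coefficients in ax + by + cz = d, and the exact equality test assumes integer or exact rational coordinates.

```python
def dot(u, v):
    """Scalar product in component form."""
    return sum(a * b for a, b in zip(u, v))

def plane_through(point, normal):
    """Return (a, b, c, d) for the plane ax + by + cz = d.

    The plane contains `point` and has `normal` as a normal vector,
    so that d = p . n, as in the vector form x . n = p . n.
    """
    if all(component == 0 for component in normal):
        raise ValueError("the zero vector cannot be a normal")
    a, b, c = normal
    return (a, b, c, dot(point, normal))

def lies_in_plane(x, plane):
    """Check whether the point x satisfies x . n = d."""
    a, b, c, d = plane
    return dot(x, (a, b, c)) == d

# The plane from the text: it contains P(2, 3, 4) and has normal n = (1, 2, -1).
plane = plane_through((2, 3, 4), (1, 2, -1))
print(plane)                              # (1, 2, -1, 4), that is, x + 2y - z = 4
print(lies_in_plane((2, 3, 4), plane))    # True
print(lies_in_plane((0, 0, 0), plane))    # False
```

The first three entries of the returned tuple are the components of the normal, which reflects the remark above that a normal can be read off from the coefficients of x, y and z.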


When we want to find the equation of a plane, it is simpler to start from 
the vector form of the equation, as demonstrated in the next worked 
exercise. 




Worked Exercise A23 


Determine the equation of the plane in R³ that contains the point
(1, −1, 4) and has the vector (2, −2, 3) as a normal.


Exercise A56 


Determine the equation of each of the following planes. 


(a) The plane that contains the point (1,0,2) and has the vector (2,3, 1) 
as a normal. 


(b) The plane that contains the point (—1,1,5) and has the 
vector (4, —2,1) as a normal. 


In Book C you will see how you can find the equation of a plane in R³ if
you know three points on the plane, rather than a point and a normal. 


Summary 


In this unit you have studied some fundamental ideas in mathematics. You 
have met a new notation for specifying sets and encountered examples of 
sets of numbers and sets of points. You have studied the operations of 
union, intersection and difference that can be performed on sets, and seen 
how to show that two sets are equal. You have also met many examples of 
functions between sets, and seen that a one-to-one function has an inverse. 
Finally, you have worked with vectors and seen how to carry out vector 
arithmetic in component form and use the scalar product of two vectors. 




Throughout the unit you have worked especially with the sets R, R²
and R³, of real numbers, ordered pairs of real numbers and ordered triples
of real numbers, respectively. You have seen that the elements of these sets
can be regarded geometrically as points on the real line, in the plane and
in space, and that points in R² or R³ can also be identified with their
position vectors.


You will continue your study of foundational mathematical concepts in the 
rest of Book A, and the ideas you meet here will be in constant use 
throughout this module. 


Learning outcomes 


After working through this unit, you should be able to: 
• recognise the equation of a line and the equation of a circle in R²
• use set notation and the notation of intervals of the real line
• determine whether one set is a subset of another, and whether two sets
are equal
• find the union, intersection and difference of two sets
• define a function and its domain, codomain and rule
• determine the image set of a function
• determine whether a function is one-to-one and/or onto
• find the inverse of a one-to-one function, and the composite of two
functions
• explain what are meant by a vector, a scalar, a scalar multiple of a
vector, and the sum and difference of two vectors
• represent vectors in R² and R³ in terms of their components, and carry
out vector arithmetic using components
• determine the equation of a line in R² or R³ in terms of vectors
• explain what is meant by the scalar product of two vectors, and use it to
find the angle between two vectors
• recognise the equation of a plane in R³, and the vector form of the
equation
• determine the equation of a plane in R³, given a point in the plane and a
normal to the plane.


Solutions to exercises 


Solution to Exercise A1


Using the formula for the equation of a line when
given its gradient and one point on it, we find that
the equation of this line is


y − (−1) = −3(x − 2).
We can rearrange this to
y = −3x + 5,
or
3x + y = 5.


Solution to Exercise A2 


(a) Since (1, 1) and (3, 5) lie on the line, its
gradient is

m = (1 − 5)/(1 − 3) = 2.

Then, since the point (1, 1) lies on the line, its
equation must be

y − 1 = 2(x − 1),
so

y = 2x − 1, or 2x − y = 1.

(b) Both these points have x-coordinate 0, so they
lie on the line with equation x = 0, the y-axis.


(c) Since the origin lies on the line, its equation
must be of the form y = mx, where m is its
gradient.

Since (4, 2) lies on the line, its coordinates must
satisfy the equation of the line. Thus 2 = 4m, so
m = ½.

Hence the equation of this line is y = ½x, or
½x − y = 0, or x − 2y = 0.

(d) Both these points have y-coordinate −1, so
they lie on the line with equation y = −1.


Solution to Exercise A3 


We can rearrange the equations of the lines to find 
their gradients as follows: 


l: y =—2r + 4 b: y=2x+$ 
l3: y=-—4r+5 l: y=4r- 5 
ls: y=4r+1 Ig: y=- 




Thus the gradients of the given lines are $-2$, $2$, $-\tfrac{1}{2}$, $\tfrac{1}{2}$, $\tfrac{1}{2}$ and $-2$, respectively.

It follows that the lines $l_1$ and $l_6$ are parallel, since their gradients are the same but their y-intercepts are different. Similarly, $l_4$ and $l_5$ are also parallel.

Lines $l_1$ and $l_4$ are perpendicular, since the product of their gradients is $-1$. For the same reason, each of the following pairs of lines is perpendicular:

$l_1$ and $l_5$; $l_2$ and $l_3$; $l_4$ and $l_6$; and $l_5$ and $l_6$.
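
The parallel and perpendicular pairs can be cross-checked mechanically. The following is a minimal sketch, assuming the gradients of the six lines are $-2$, $2$, $-\tfrac{1}{2}$, $\tfrac{1}{2}$, $\tfrac{1}{2}$ and $-2$ (the values consistent with the pairs listed above); the function names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

# Assumed gradients of l1, ..., l6 (see the lead-in).
# Exact fractions avoid floating-point error in the product test.
gradients = {
    1: Fraction(-2), 2: Fraction(2), 3: Fraction(-1, 2),
    4: Fraction(1, 2), 5: Fraction(1, 2), 6: Fraction(-2),
}

def perpendicular_pairs(grads):
    """Pairs (i, j) with i < j whose gradients have product -1."""
    return [(i, j) for i, j in combinations(sorted(grads), 2)
            if grads[i] * grads[j] == -1]

def equal_gradient_pairs(grads):
    """Pairs (i, j) with i < j whose gradients are equal.

    Such lines are parallel only if their y-intercepts differ;
    that extra check is not made here.
    """
    return [(i, j) for i, j in combinations(sorted(grads), 2)
            if grads[i] == grads[j]]

print(perpendicular_pairs(gradients))
print(equal_gradient_pairs(gradients))
```

The first list should contain the five perpendicular pairs named in the solution, and the second the two pairs of lines with equal gradients.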
Solution to Exercise A4 


We use the formula for the distance between two 
points in the plane. This gives the following 
distances. 


(c) /(—1?+0-2)=Vir7 


(d) $\sqrt{(-1 - 3)^2 + (4 - (-8))^2} = \sqrt{160} = 4\sqrt{10}$
(The two points in part (a) are on the x-axis, so in 
fact there is no need to use the distance formula to 
find the distance between them.) 
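
The distance formula is easy to check numerically. This is a small sketch rather than part of the solution; it assumes the two points in part (d) are $(-1, 4)$ and $(3, -8)$, as read off from the expression above, and the helper name `distance` is hypothetical.

```python
import math

def distance(p, q):
    """Euclidean distance between points p and q of equal dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Assumed points for part (d): (-1, 4) and (3, -8).
d = distance((-1, 4), (3, -8))
print(d, 4 * math.sqrt(10))  # the two values should agree
```

The same function works unchanged for points in $\mathbb{R}^3$, since it simply sums the squared coordinate differences.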


Solution to Exercise A5 
(a) This circle has equation
$(x - 0)^2 + (y - 0)^2 = 4^2,$
which can be simplified to give
$x^2 + y^2 = 16.$
(b) This circle has equation
$(x - (-1))^2 + (y - 0)^2 = (\sqrt{2})^2,$
which can be simplified to give
$(x + 1)^2 + y^2 = 2.$
(c) This circle has equation
$(x - 3)^2 + (y - (-4))^2 = 2^2,$
which can be simplified to give
$(x - 3)^2 + (y + 4)^2 = 4.$




Solution to Exercise A6 


[Figure: the point $(0, 1, 2)$ plotted on three-dimensional axes.]


Solution to Exercise A7 


We use the formula for the distance between two points in $\mathbb{R}^3$. This gives the following distances.

(a) $\sqrt{(4 - 1)^2 + (1 - 1)^2 + (-3 - 1)^2} = \sqrt{9 + 0 + 16} = 5$
(b) «/G= 12+ 0— 
$= \sqrt{4 + 4 + 0} = 2\sqrt{2}$


Solution to Exercise A8 

(a) True: —3 is an integer. 

(b) False: 5 is a natural number. 

(c) False: 1.3 is the rational number $\tfrac{13}{10}$.

(d) True: both 1 and 3 are rational numbers. 
(e) True: —7 is a real number. 

(f) False: 5 is not a natural number.


(g) False: 1 is a non-zero real number, but 0 is 
not. 


(h) False: $\sqrt{2}$ is a real number.


Solution to Exercise A9 

(a) True: 1 is a member of the given set. 

(b) True: the set {—9} is a member of the given 
set, although the number —9 is not. 

(c) False: the number 9 belongs to the given set, 
but the set {9} does not. 


(d) False: the point (0,1) is not a member of the given set of points in $\mathbb{R}^2$, although the point (1,0) is.

(e) False: the numbers 1 and 0 are not members of the given set of points in $\mathbb{R}^2$, although the point (1,0) is.

(f) True: the set {1,0} is the same as the set {0,1}, and so is a member of the given set. Notice that the members of this set are themselves sets, and not points in $\mathbb{R}^2$.


Solution to Exercise A10 
(a) True: > is in R, and it satisfies the condition 
T> 3. 


(b) True: 7 = 3 x 2+ 1, so 7 is of the form 3k + 1 
for some k € Z. 

(c) False: — is not in Z. 

(d) False: 8 cannot be expressed as $2^x$ for some number $x \in \mathbb{R}$ satisfying $0 < x < 2$; in fact $8 = 2^3$.

(e) True: 9 is in $\mathbb{Z}$, and $9 = 3^2$ so $9 = k^2$ for some $k \in \mathbb{Z}$.

(f) True: $6 = 3(3 - 1)$, so 6 is of the form $m(m - 1)$ for some $m \in \mathbb{N}$.

(g) False: 4 is an even integer, but it does not satisfy $0 < x < 4$.


Solution to Exercise A11 
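(a) $\{k \in \mathbb{Z} : -2 < k < 1000\}$

(b) $\{x \in \mathbb{Q} : x > 0 \text{ and } x^2 > 2\}$ or $\{x \in \mathbb{Q} : x > 0,\ x^2 > 2\}$

(c) $\{2n : n \in \mathbb{N}\}$
(d) $\{2^k : k \in \mathbb{Z}\}$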


Solution to Exercise A12 


(a) False: the set (1,5) is an open interval and 
does not include the endpoint 1. 


(b) True: the set (—1, 1] is half-open, with the 
upper endpoint 1 included. 


(c) False: $\infty$ does not denote a number and so is not in the interval.

(d) True: $\mathbb{R}^*$ denotes the set of non-zero real numbers, so 0 is not a member of this set.

(e) False: $x \in \mathbb{R}^*$ means $x$ is a non-zero real number, while $(0, \infty)$ comprises just the positive real numbers. For example, the number $-1$ is in $\mathbb{R}^*$, but not in $(0, \infty)$.


Solution to Exercise A13 
(a) [-11,2) 

(b) (—6.5,21] 

(c) $(-273, \infty)$


Solution to Exercise A14 
(a) $l = \{(x, y) \in \mathbb{R}^2 : y = 2x + 5\}$

(There are other ways to specify this line; another example is $l = \{(x, 2x + 5) : x \in \mathbb{R}\}$.)

(b) The line $l$ has equation $y = 1 - x$, so it is as follows.

(c) The line $l$ has equation $y = x$ (since here $m = 1$ and $c = 0$), so it is as follows.




Solution to Exercise A15 
(a) $C = \{(x, y) \in \mathbb{R}^2 : (x - 1)^2 + (y + 4)^2 = 9\}$
(b) The circle has centre (1,3) and radius 2.


Solution to Exercise A16 


(a) This set is a half-plane with the boundary line 
excluded, as follows. 


[Figure: sketch of the half-plane, with boundary line $x = 1$.]


(b) This set is another half-plane, but this time 
the boundary line is included, as follows. 




(c) This set is a disc with the boundary excluded, 


as follows. 
[Figure: sketch of the disc and its boundary circle.]


(d) This set consists of the points outside a disc 
with centre (0, —3) and radius 1, as follows. 


[Figure: sketch of the region outside the disc with centre $(0, -3)$ and radius 1.]


Solution to Exercise A17 
$\{(x, y) \in \mathbb{R}^2 : 0 < x < 2,\ 1 < y < 3\}$


Solution to Exercise A18 


(a) The set $B$ consists of the solutions of the equation
$x^2 + x - 6 = 0,$
which we can write as
$(x - 2)(x + 3) = 0.$
So $B = \{2, -3\} = A$.

(b) The two sets are
$A = \{k \in \mathbb{Z} : k \text{ is odd and } 0 < k < 8\} = \{1, 3, 5, 7\},$
$B = \{2n + 1 : n \in \mathbb{N} \text{ and } n^2 < 25\} = \{3, 5, 7, 9\}.$
Hence $A \neq B$, either because $9 \in B$ but $9 \notin A$, or because $1 \in A$ but $1 \notin B$.
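
Finite sets such as these can be built and compared directly. The sketch below uses Python's built-in `set` type with comprehensions that mirror the set-builder descriptions in part (b); the search range for $n$ is an arbitrary bound chosen for illustration, since $n^2 < 25$ already forces $n \le 4$.

```python
# A = {k in Z : k is odd and 0 < k < 8}
A = {k for k in range(1, 8) if k % 2 == 1}

# B = {2n + 1 : n in N and n^2 < 25}; the upper bound 26 is arbitrary,
# because the condition n * n < 25 excludes every n >= 5.
B = {2 * n + 1 for n in range(1, 26) if n * n < 25}

print(sorted(A), sorted(B))
print(A == B)                        # equality of sets
print(sorted(B - A), sorted(A - B))  # elements witnessing A != B
```

The two set differences pick out exactly the witnesses mentioned in the solution: an element of $B$ that is not in $A$, and an element of $A$ that is not in $B$.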


Solution to Exercise A19 


(a) Each element of $A$ is a point in $\mathbb{R}^2$.

We calculate $x - 4y$ using the coordinates of each point of $A$:
$5 - 4 \times 2 = -3,$
$1 - 4 \times 1 = -3,$
$-3 - 4 \times 0 = -3.$

This shows that each element of $A$ is an element of $B$, so $A \subseteq B$.


(b) The sets A and B are sketched below. 
[Figure: sketch of the sets $A$ and $B$.]

The set $A$ is the interior of the unit circle, and $B$ is the half-plane consisting of all points with negative y-coordinate. So $A \not\subseteq B$, because, for example, the


point (5, 5) belongs to A but not to B. (Any one 
point that is in set A but not in set B shows that 
$A \not\subseteq B$.)

(c) Let $x$ be an arbitrary element of $A$; then $x \in \mathbb{R}$ and satisfies $-1 < x < 0$. This inequality gives
$-1 + 1 < x + 1 < 0 + 1,$
that is,
$0 < x + 1 < 1.$
Hence
$0 < (x + 1)^2 < 1,$
so $x \in B$.

Since $x$ is an arbitrary element of $A$, we conclude that $A \subseteq B$.


Solution to Exercise A20 


(a) We showed that $A \subseteq B$ in the solution to Exercise A19(a). Also, for example, the point (9, 3) lies in $B$, since
$9 - 4 \times 3 = -3,$
but does not lie in $A$. Therefore $A$ is a proper subset of $B$.

(b) We showed that $A \subseteq B$ in the solution to Exercise A19(c). Also, for example, $-2$ lies in $B$, since
$(-2 + 1)^2 = (-1)^2 = 1,$
but does not lie in $A$. Therefore $A$ is a proper subset of $B$.


Solution to Exercise A21 
(a) First we show that $A \subseteq B$.
Let $(x, y) \in A$; then $(x, y) \in \mathbb{R}^2$, and for some $t \in \mathbb{R}$, we have $x = t^2$ and $y = 2t$. Hence
$y^2 = (2t)^2 = 4t^2 = 4x.$
So $(x, y) \in B$, and $A \subseteq B$.
Next we show that $B \subseteq A$.

Let $(x, y) \in B$; then $y^2 = 4x$. We must show that there is a value of $t$ in $\mathbb{R}$ such that $x = t^2$ and $y = 2t$, so that $(x, y) \in A$. Let $t$ be given by $y = 2t$; that is, $t = \tfrac{1}{2}y$. Then, since $4x = y^2$, we have $x = \tfrac{1}{4}y^2$, and substituting for $y$ gives
$x = \tfrac{1}{4}(2t)^2 = t^2.$

Hence $(x, y) = (t^2, 2t) \in A$, and so $B \subseteq A$.
Since $A \subseteq B$ and $B \subseteq A$, it follows that $A = B$.
(b) First we show that $A \subseteq B$.
Let $(x, y) \in A$; then $2x + y - 3 = 0$. We must show that there is a value of $t$ in $\mathbb{R}$ such that $x = t + 1$ and $y = 1 - 2t$. Let $t$ be given by $x = t + 1$, that is, $t = x - 1$. Then, since $2x + y - 3 = 0$, we have
$y = 3 - 2x = 3 - 2(t + 1) = 1 - 2t.$
Hence $(x, y) = (t + 1, 1 - 2t) \in B$, and so $A \subseteq B$.
Next we show that $B \subseteq A$.

Let $(x, y) \in B$; then $(x, y) \in \mathbb{R}^2$, and for some $t \in \mathbb{R}$, we have $x = t + 1$ and $y = 1 - 2t$. We must show that $(x, y)$ satisfies $2x + y - 3 = 0$. Now
$2x + y - 3 = 2(t + 1) + (1 - 2t) - 3 = 0,$
as required, so $(x, y) \in A$. Therefore $B \subseteq A$.
Since $A \subseteq B$ and $B \subseteq A$, it follows that $A = B$.


Solution to Exercise A22 
(a) $(1, 7) \cup [4, 11] = (1, 11]$.

(b) $\mathbb{R}^*$ denotes the set of non-zero real numbers, and so is the union of the two intervals $(-\infty, 0)$ and $(0, \infty)$; that is
$\mathbb{R}^* = (-\infty, 0) \cup (0, \infty).$

(c) The union of the half-plane and disc is

[Figure: sketch of the union $H \cup D$.]


Solution to Exercise A23 
(a) $(1, 7) \cap [4, 11] = [4, 7)$.

(b) The intersection is

[Figure: sketch of the intersection $H \cap D$.]


Solution to Exercise A24 


(a) $(1, 7) - [4, 11] = (1, 4)$ and
$[4, 11] - (1, 7) = [7, 11]$.

(b) The two differences are

[Figure: sketches of the two differences of $H$ and $D$.]


Solution to Exercise A25 


(a) This is the translation of the plane that moves 
each point to the right by 2 units and up by 
3 units. 


(b) This is the reflection of the plane in the x-axis. 


(c) This is the rotation of the plane through $\pi/2$ anticlockwise about the origin.


Solution to Exercise A26 
Only diagram (b) represents a function. 


Diagram (a) does not represent a function, as there 
is no arrow from the element 3. 


Diagram (c) does not represent a function, as there 
are two arrows from the element 1. 


Solution to Exercise A27 

(a) $f(S) = \{f(0), f(1), f(2), f(3)\} = \{-1, 0, 1, 2\}$.

(b) $f(\mathbb{Z}) = \{\ldots, f(-1), f(0), f(1), f(2), \ldots\} = \{\ldots, -2, -1, 0, 1, \ldots\} = \mathbb{Z}$.


Solution to Exercise A28 

The images of the elements of A are 
$f(0) = 9$, $f(1) = 8$, $f(2) = 7$, $f(3) = 6$,
$f(4) = 5$, $f(5) = 4$, $f(6) = 3$, $f(7) = 2$,
$f(8) = 1$, $f(9) = 0$.


So the image set of f is 
{0, 1,2, 3,4,5,6,7,8,9} =A. 
Solution to Exercise A29 


Only diagram (a) represents an onto function. 


Diagram (b) does not even represent a function, as 
there is no arrow from the element 4. 


Diagram (c) represents a function that is not onto, 
as there is no arrow going to the element 1. 


Solution to Exercise A30 


(a) The sketch of the graph of $f$ below suggests that $f(\mathbb{R}) = [1, \infty)$.

[Figure: graph of $y = 1 + x^2$.]

We prove that $f(\mathbb{R}) = [1, \infty)$.

Let $x \in \mathbb{R}$; then $f(x) = 1 + x^2$. Since $x^2 \ge 0$, we have $1 + x^2 \ge 1$ and so $f(\mathbb{R}) \subseteq [1, \infty)$.

We must show that $f(\mathbb{R}) \supseteq [1, \infty)$.

Let $y \in [1, \infty)$. We must show that there exists $x \in \mathbb{R}$ such that $f(x) = y$; that is, $1 + x^2 = y$. Now $x = \sqrt{y - 1}$ is real, since $y \ge 1$, and satisfies $f(x) = y$, as required. (Alternatively, $x = -\sqrt{y - 1}$ is real and satisfies $f(x) = y$.)

Thus $f(\mathbb{R}) \supseteq [1, \infty)$.

Since $f(\mathbb{R}) \subseteq [1, \infty)$ and $f(\mathbb{R}) \supseteq [1, \infty)$, it follows that $f(\mathbb{R}) = [1, \infty)$, so the image set of $f$ is $[1, \infty)$, as expected.

The interval $[1, \infty)$ is not the whole of the codomain $\mathbb{R}$, so $f$ is not onto.


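(b) This function is the reflection of the plane in the x-axis. This suggests that $f(\mathbb{R}^2) = \mathbb{R}^2$.

We know that $f(\mathbb{R}^2) \subseteq \mathbb{R}^2$, so we must show that $f(\mathbb{R}^2) \supseteq \mathbb{R}^2$.

Let $(x', y') \in \mathbb{R}^2$. We must show that there exists $(x, y) \in \mathbb{R}^2$ such that $f(x, y) = (x', y')$; so $(x, -y) = (x', y')$, that is,
$x = x', \quad -y = y'.$

Rearranging these equations, we obtain
$x = x', \quad y = -y'.$

So, $(x, y) \in \mathbb{R}^2$ and $f(x, y) = (x', y')$, as required. Thus $f(\mathbb{R}^2) \supseteq \mathbb{R}^2$.

Since $f(\mathbb{R}^2) \subseteq \mathbb{R}^2$ and $f(\mathbb{R}^2) \supseteq \mathbb{R}^2$, it follows that $f(\mathbb{R}^2) = \mathbb{R}^2$, so the image set of $f$ is $\mathbb{R}^2$, as expected.

The codomain of $f$ is also $\mathbb{R}^2$, so $f$ is onto.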


Solution to Exercise A31 
Only diagram (c) represents a one-to-one function. 


Diagram (a) represents a function that is not 
one-to-one, as there are two arrows going to the 
element 3. 


Diagram (b) does not even represent a function, as 
there is no arrow from the element 2. 


Solution to Exercise A32 


(a) This function is not one-to-one since, for 
example, 


$f(2) = f(-2) = 1 + 4 = 5.$


(b) This function is the reflection of the plane in 
the x-axis, so we expect it to be one-to-one. We 
now prove this algebraically. 


Suppose that $f(x_1, y_1) = f(x_2, y_2)$; then
$(x_1, -y_1) = (x_2, -y_2).$

This means that $x_1 = x_2$ and $-y_1 = -y_2$. It follows that $y_1 = y_2$, so we have shown that $(x_1, y_1) = (x_2, y_2)$, that is, $f$ is one-to-one.


Solution to Exercise A33 


(a) In Exercise A32 we saw that f is not 
one-to-one, so f does not have an inverse function. 


(b) In Exercise A32 we saw that f is one-to-one, 
so f has an inverse function. 


In Exercise A30 we saw that the image set of $f$ is $\mathbb{R}^2$ and, for each $(x', y') \in \mathbb{R}^2$, we have
$(x', y') = f(x', -y').$
So $f^{-1}$ is the function
$f^{-1} : \mathbb{R}^2 \to \mathbb{R}^2$
$(x', y') \mapsto (x', -y').$

This can be expressed in terms of $x$ and $y$ as
$f^{-1} : \mathbb{R}^2 \to \mathbb{R}^2$
$(x, y) \mapsto (x, -y).$

(In this case, $f^{-1}$ is actually equal to $f$, which is what we would expect for a reflection.)
(c) The graph of this function is an upward sloping straight line, which suggests that it is one-to-one. First we confirm this algebraically. Suppose that $f(x_1) = f(x_2)$; then
$8x_1 + 3 = 8x_2 + 3,$
so $8x_1 = 8x_2$, and hence $x_1 = x_2$. Thus $f$ is one-to-one, and so it has an inverse function.

We now find the image set of $f$. We suspect that its image set is $\mathbb{R}$, so we now prove this algebraically. Let $y$ be an arbitrary element in $\mathbb{R}$. We must show that there exists an element $x$ in the domain $\mathbb{R}$ such that
$f(x) = y$; that is, $8x + 3 = y$.

Rearranging this equation, we obtain
$x = \dfrac{y - 3}{8}.$

This is in $\mathbb{R}$ and satisfies $f(x) = y$, as required. Thus the image set of $f$ is $\mathbb{R}$.

Hence $f^{-1}$ is the function
$f^{-1} : \mathbb{R} \to \mathbb{R}$
$y \mapsto \dfrac{y - 3}{8}.$

This can be expressed in terms of $x$ as
$f^{-1} : \mathbb{R} \to \mathbb{R}$
$x \mapsto \dfrac{x - 3}{8}.$


Solution to Exercise A34 
The function
$g : [-\pi/2, \pi/2] \to [-1, 1]$
$x \mapsto \sin x$
is a restriction of $f$ that is one-to-one.

(There are many other possibilities, for example, the restriction of the domain to $[\pi/2, 3\pi/2]$.)


Solution to Exercise A35 
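(a) The rule of $g \circ f$ is
$(g \circ f)(x) = g(f(x)) = g(-x) = 3(-x) + 1 = -3x + 1.$
Thus $g \circ f$ is the function
$g \circ f : \mathbb{R} \to \mathbb{R}$
$x \mapsto -3x + 1.$
(b) The rule of $f \circ g$ is
$(f \circ g)(x) = f(g(x)) = f(3x + 1) = -(3x + 1) = -3x - 1.$
Thus $f \circ g$ is the function
$f \circ g : \mathbb{R} \to \mathbb{R}$
$x \mapsto -3x - 1.$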
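
Composition of rules translates directly into code. The sketch below assumes, as the working above indicates, that $f(x) = -x$ and $g(x) = 3x + 1$; the helper `compose` is a hypothetical name, and domains and codomains are not modelled, only the rules.

```python
def compose(outer, inner):
    """Return the composite function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

def f(x):
    # Assumed rule of f: x -> -x
    return -x

def g(x):
    # Assumed rule of g: x -> 3x + 1
    return 3 * x + 1

g_after_f = compose(g, f)  # g o f, expected rule x -> -3x + 1
f_after_g = compose(f, g)  # f o g, expected rule x -> -3x - 1

for x in (-2, 0, 1, 5):
    print(x, g_after_f(x), f_after_g(x))
```

Comparing the two output columns shows that the order of composition matters here: the two composites differ at every input.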


Solution to Exercise A36 
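The rule of $f \circ g$ is
$(f \circ g)(x, y) = (-x, -y).$
Thus $f \circ g$ is the function
$f \circ g : \mathbb{R}^2 \to \mathbb{R}^2$
$(x, y) \mapsto (-x, -y).$
(In this case, $f \circ g = g \circ f$.)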


Solution to Exercise A37 
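The rule of $g \circ f$ is
$(g \circ f)(x) = g(f(x)) = g(3x + 1) = \dfrac{3}{(3x + 1) + 2} = \dfrac{1}{x + 1}.$

The domain of $g \circ f$ is
$\{x \in [-1, 1] : f(x) \in \mathbb{R} - \{-2\}\}.$

If $x \in [-1, 1]$, then $f(x) \in \mathbb{R} - \{-2\}$ unless $f(x) = -2$. Now $f(x) = -2$ when
$3x + 1 = -2,$
that is, when
$x = -1.$

So the domain of $g \circ f$ is
$[-1, 1] - \{-1\} = (-1, 1].$
Thus $g \circ f$ is the function
$g \circ f : (-1, 1] \to \mathbb{R}$
$x \mapsto \dfrac{1}{x + 1}.$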


Solution to Exercise A38 
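The domain of $f$ is $\mathbb{R}$, and for each $x \in \mathbb{R}$ we have
$g(f(x)) = g(5x - 3) = \dfrac{(5x - 3) + 3}{5} = x;$
that is, $g \circ f = i_{\mathbb{R}}$.

The domain of $g$ is also $\mathbb{R}$, and for each $y \in \mathbb{R}$ we have
$f(g(y)) = f\left(\dfrac{y + 3}{5}\right) = 5\left(\dfrac{y + 3}{5}\right) - 3 = y;$
that is, $f \circ g = i_{\mathbb{R}}$.

Since $g \circ f = i_{\mathbb{R}}$ and $f \circ g = i_{\mathbb{R}}$, it follows that $g$ is the inverse function of $f$.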
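
The two identities can be spot-checked with exact rational arithmetic. This is only an illustrative sketch, assuming the rules $f(x) = 5x - 3$ and $g(y) = (y + 3)/5$ that appear in the working above; a check on finitely many sample values supports the algebra but does not replace it.

```python
from fractions import Fraction

def f(x):
    # Assumed rule of f: x -> 5x - 3
    return 5 * x - 3

def g(y):
    # Assumed rule of g: y -> (y + 3) / 5
    return (y + 3) / 5

# Exact rational sample points, so equality tests are not affected
# by floating-point rounding.
samples = [Fraction(n, 7) for n in range(-20, 21)]

g_f_is_identity = all(g(f(x)) == x for x in samples)
f_g_is_identity = all(f(g(y)) == y for y in samples)
print(g_f_is_identity, f_g_is_identity)
```

Both composites need to be checked, since the definition of an inverse function requires each of them to be the identity.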


Solution to Exercise A39 


This is a translation of the plane that shifts each 
point to the left by 1 unit and up by 3 units, so we 
expect its inverse to shift the plane to the right by 
1 unit and down by 3 units. 
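Let
$g : \mathbb{R}^2 \to \mathbb{R}^2$
$(x, y) \mapsto (x + 1, y - 3).$
The domain of $f$ is $\mathbb{R}^2$, and for each $(x, y) \in \mathbb{R}^2$ we have
$g(f(x, y)) = g(x - 1, y + 3) = (x - 1 + 1, y + 3 - 3) = (x, y);$
that is, $g \circ f = i_{\mathbb{R}^2}$.
The domain of $g$ is also $\mathbb{R}^2$, and for each $(x, y) \in \mathbb{R}^2$ we have
$f(g(x, y)) = f(x + 1, y - 3) = (x + 1 - 1, y - 3 + 3) = (x, y);$
that is, $f \circ g = i_{\mathbb{R}^2}$.

Since $g \circ f = i_{\mathbb{R}^2}$ and $f \circ g = i_{\mathbb{R}^2}$, it follows that $g$ is the inverse function of $f$.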


Solution to Exercise A40 


The vector d is in the same direction as a, but 
none of the other vectors is; also, the magnitude of 
d is two-thirds that of a. Hence 


$\mathbf{d} = \tfrac{2}{3}\mathbf{a}$ and $\mathbf{a} = \tfrac{3}{2}\mathbf{d}$.


Next, e is parallel to b but in the opposite 
direction; none of the others is parallel to these two 
vectors. Also, the magnitude of e is three times 
that of b. Hence 


$\mathbf{e} = -3\mathbf{b}$ and $\mathbf{b} = -\tfrac{1}{3}\mathbf{e}$.


Finally, c and f are not multiples of any of the 
other vectors. 


Solution to Exercise A41 


The vector 3d is in the same direction as d, but its 
magnitude is three times that of d; the vector —2d 
is in the opposite direction to that of d, and its 
magnitude is twice that of d. 


Solution to Exercise A42 
We use the rule for forming a scalar multiple of a 


vector, and the Triangle Law for the addition of 
vectors. 


[Figure: the required vectors constructed from $\mathbf{p}$ and $\mathbf{q}$.]


Solution to Exercise A43 


[Figure: sketch of the solution.]


Solution to Exercise A44 
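(a) Here $\mathbf{p} = (3, -1)$ and $\mathbf{q} = (-1, -2)$, so
$\mathbf{p} + \mathbf{q} = (3 + (-1), -1 + (-2)) = (2, -3),$
$\mathbf{p} - \mathbf{q} = (3 - (-1), -1 - (-2)) = (4, 1).$
(b) Here $\mathbf{p} = -\mathbf{i} - 2\mathbf{j}$ and $\mathbf{q} = 2\mathbf{i} - \mathbf{j}$, so
$\mathbf{p} + \mathbf{q} = (-1 + 2)\mathbf{i} + (-2 + (-1))\mathbf{j} = \mathbf{i} - 3\mathbf{j},$
$\mathbf{p} - \mathbf{q} = (-1 - 2)\mathbf{i} + (-2 - (-1))\mathbf{j} = -3\mathbf{i} - \mathbf{j}.$
(c) Here $\mathbf{p} = -\mathbf{i} + 2\mathbf{k}$ and $\mathbf{q} = \mathbf{i} - 2\mathbf{j} - \mathbf{k}$, so
$\mathbf{p} + \mathbf{q} = (-1 + 1)\mathbf{i} - 2\mathbf{j} + (2 - 1)\mathbf{k} = -2\mathbf{j} + \mathbf{k},$
$-\mathbf{q} = -\mathbf{i} + 2\mathbf{j} + \mathbf{k},$
$\mathbf{p} - \mathbf{q} = (-1 - 1)\mathbf{i} - (-2)\mathbf{j} + (2 - (-1))\mathbf{k} = -2\mathbf{i} + 2\mathbf{j} + 3\mathbf{k}.$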


Solution to Exercise A45 
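(a) Since $\mathbf{p} = (3, -1)$ and $\mathbf{q} = (-1, -2)$,
$2\mathbf{p} = (6, -2),$
$3\mathbf{q} = (-3, -6),$
$2\mathbf{p} - 3\mathbf{q} = (9, 4).$
The magnitude of $\mathbf{q}$ is
$|\mathbf{q}| = \sqrt{(-1)^2 + (-2)^2} = \sqrt{5}.$
(b) Since $\mathbf{p} = -\mathbf{i} + 2\mathbf{k}$ and $\mathbf{q} = \mathbf{i} - 2\mathbf{j} - \mathbf{k}$,
$2\mathbf{p} = -2\mathbf{i} + 4\mathbf{k},$
$3\mathbf{q} = 3\mathbf{i} - 6\mathbf{j} - 3\mathbf{k},$
$2\mathbf{p} - 3\mathbf{q} = -5\mathbf{i} + 6\mathbf{j} + 7\mathbf{k}.$
The magnitude of $\mathbf{q}$ is
$|\mathbf{q}| = \sqrt{1^2 + (-2)^2 + (-1)^2} = \sqrt{6}.$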


Solution to Exercise A46 
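(a) When $\mathbf{v} = (2, -3)$, the magnitude of $\mathbf{v}$ is
$|\mathbf{v}| = \sqrt{2^2 + (-3)^2} = \sqrt{4 + 9} = \sqrt{13},$
so
$\hat{\mathbf{v}} = \dfrac{1}{\sqrt{13}}(2, -3) = \left(\dfrac{2}{\sqrt{13}}, -\dfrac{3}{\sqrt{13}}\right).$
(b) When $\mathbf{v} = 5\mathbf{i} + 12\mathbf{j}$, the magnitude of $\mathbf{v}$ is
$|\mathbf{v}| = \sqrt{5^2 + 12^2} = \sqrt{25 + 144} = 13,$
so
$\hat{\mathbf{v}} = \dfrac{1}{|\mathbf{v}|}\mathbf{v} = \dfrac{5}{13}\mathbf{i} + \dfrac{12}{13}\mathbf{j}.$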
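
The calculation of a unit vector can be sketched in a few lines. The function names below are hypothetical, vectors are represented as plain tuples of components, and floating-point arithmetic is used, so results are approximate in general.

```python
import math

def magnitude(v):
    """Magnitude of a vector given as a tuple of components."""
    return math.sqrt(sum(c * c for c in v))

def unit_vector(v):
    """Vector of magnitude 1 in the direction of v."""
    m = magnitude(v)
    if m == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(c / m for c in v)

print(magnitude((5, 12)))     # magnitude of 5i + 12j
print(unit_vector((5, 12)))   # components 5/13 and 12/13
print(unit_vector((2, -3)))   # components 2/sqrt(13) and -3/sqrt(13)
```

The zero vector is rejected explicitly, since dividing by its magnitude is not defined.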


Solution to Exercise A47 
(a) Since p = (5,3) and q = (1,4), 
p—q=(4,-1), 
p+q= (6,7), 
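$\tfrac{1}{2}\mathbf{p} + \tfrac{1}{2}\mathbf{q} = \left(\tfrac{5}{2}, \tfrac{3}{2}\right) + \left(\tfrac{1}{2}, 2\right) = \left(3, \tfrac{7}{2}\right).$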




Solution to Exercise A48 


(a) The vector form of the equation of $l$ is
$\mathbf{r} = (1 - \lambda)(3, 1) + \lambda(2, 3) = (3 - \lambda, 1 + 2\lambda).$
(b) Using the formula above with A = Z, 3 and —4 
in turn, we obtain the following position vectors: 
n=@-$14)=(9, 
rə = (3 — 3,1+3) = (8,4), 
r3= (3—(—3),1+(-))=G,9)- 
Thus the three points on the line are the points 


R,, Rə and R3, with coordinates ($, 4), (3,4) and 
(Z, 0), respectively. 


(c) yr 


Solution to Exercise A49 

(a) The vector form of the equation of $l$ is
$\mathbf{r} = (3 - \lambda, 1 + 2\lambda).$

Hence at the point $(4, -1)$ on $l$, we have
$(4, -1) = (3 - \lambda, 1 + 2\lambda).$

Equating the corresponding components gives
$4 = 3 - \lambda$ and $-1 = 1 + 2\lambda.$

The first equation gives $\lambda = -1$, and this value of $\lambda$ also satisfies the other equation. Hence the value of $\lambda$ corresponding to the point $(4, -1)$ in the vector form of the equation of $l$ is $\lambda = -1$.


(b) The point (5,4) lies on J if and only if there is 
some real number A for which 

(4,5) =(8-A, 1422). 
Equating corresponding components gives 


3-A=4 and 1+2=4 


The first of these equations has solution À = 3, and 
the second has solution À = —4. 


It follows that there is no real number A that 
satisfies the vector form of the equation of l, when 
r= (5, 5); so the point (5, 5) does not lie on l. 


Solution to Exercise A50 
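(a) The vector form of the equation of the line $l$ is
$\mathbf{r} = (1 - \lambda)(2, 1, 0) + \lambda(1, 0, -1) = (2 - \lambda, 1 - \lambda, -\lambda).$
(b) Using the formula above with $\lambda = \tfrac{1}{2}$ and $-1$, we obtain the following position vectors:
$\mathbf{r}_1 = \left(2 - \tfrac{1}{2}, 1 - \tfrac{1}{2}, -\tfrac{1}{2}\right) = \left(\tfrac{3}{2}, \tfrac{1}{2}, -\tfrac{1}{2}\right),$
$\mathbf{r}_2 = (2 - (-1), 1 - (-1), -(-1)) = (3, 2, 1).$

Thus the two points have coordinates $\left(\tfrac{3}{2}, \tfrac{1}{2}, -\tfrac{1}{2}\right)$ and $(3, 2, 1)$.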
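
Points on a line given in vector form can be generated by evaluating $(1 - \lambda)\mathbf{p} + \lambda\mathbf{q}$ componentwise. The sketch below assumes the line passes through $(2, 1, 0)$ and $(1, 0, -1)$, as in part (a); the function name `point_on_line` is hypothetical, and `Fraction` is used so that fractional values of $\lambda$ give exact coordinates.

```python
from fractions import Fraction

def point_on_line(p, q, lam):
    """Position vector (1 - lam) * p + lam * q, computed componentwise."""
    return tuple((1 - lam) * a + lam * b for a, b in zip(p, q))

P = (2, 1, 0)    # assumed first point on the line
Q = (1, 0, -1)   # assumed second point on the line

print(point_on_line(P, Q, 0))               # lam = 0 gives P
print(point_on_line(P, Q, 1))               # lam = 1 gives Q
print(point_on_line(P, Q, -1))              # a point beyond P
print(point_on_line(P, Q, Fraction(1, 2)))  # the midpoint of P and Q
```

Values of $\lambda$ between 0 and 1 give points on the segment between the two chosen points, and other values give points on the rest of the line.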
Solution to Exercise A51 


We use the formula for the scalar product of 
vectors in component form. 


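(a) $(2, 3) \cdot \left(\tfrac{5}{2}, -4\right) = 2 \times \tfrac{5}{2} + 3 \times (-4) = 5 - 12 = -7$

(b) $(1, 4) \cdot \left(2, -\tfrac{1}{2}\right) = 1 \times 2 + 4 \times \left(-\tfrac{1}{2}\right) = 2 - 2 = 0$

(c) $(-2\mathbf{i} + \mathbf{j}) \cdot (3\mathbf{i} - 2\mathbf{j}) = (-2) \times 3 + 1 \times (-2) = -6 - 2 = -8$
(d) $(1, -1, -2) \cdot (3, -2, -5) = 1 \times 3 + (-1) \times (-2) + (-2) \times (-5) = 3 + 2 + 10 = 15$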




Solution to Exercise A52 


In each case we let $\mathbf{u}$ denote the first vector of the pair, $\mathbf{v}$ the second vector, and $\theta$ the angle between the two vectors.


(a) Here 


Hence 
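$\cos\theta = \dfrac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}||\mathbf{v}|} = \dfrac{13}{\sqrt{17}\sqrt{29}} = \dfrac{13}{\sqrt{493}},$
so
$\theta = \cos^{-1}\left(\dfrac{13}{\sqrt{493}}\right) = 0.95$ radians (to 2 d.p.).
(b) Here
$\mathbf{u} \cdot \mathbf{v} = (-2, 2) \cdot (1, -1) = -2 - 2 = -4,$
$|\mathbf{u}| = \sqrt{(-2)^2 + 2^2} = \sqrt{4 + 4} = \sqrt{8} = 2\sqrt{2},$
$|\mathbf{v}| = \sqrt{1^2 + (-1)^2} = \sqrt{1 + 1} = \sqrt{2}.$
Hence
$\cos\theta = \dfrac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}||\mathbf{v}|} = \dfrac{-4}{2\sqrt{2}\sqrt{2}} = -1,$
so
$\theta = \cos^{-1}(-1) = \pi$ radians.
You might have expected this result, because $\mathbf{u}$ and $\mathbf{v}$ point in opposite directions (in fact, $\mathbf{u} = -2\mathbf{v}$).
(c) Here
$\mathbf{u} \cdot \mathbf{v} = (9\mathbf{i} - 2\mathbf{j}) \cdot (\mathbf{i} + 2\mathbf{j}) = 9 \times 1 + (-2) \times 2 = 9 - 4 = 5,$
$|\mathbf{u}| = \sqrt{9^2 + (-2)^2} = \sqrt{81 + 4} = \sqrt{85},$
$|\mathbf{v}| = \sqrt{1^2 + 2^2} = \sqrt{5}.$

Hence
$\cos\theta = \dfrac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}||\mathbf{v}|} = \dfrac{5}{\sqrt{85}\sqrt{5}} = \dfrac{1}{\sqrt{17}},$
so
$\theta = \cos^{-1}\left(\dfrac{1}{\sqrt{17}}\right) = 1.33$ radians (to 2 d.p.).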


Solution to Exercise A53 


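In each case we let $\mathbf{u}$ denote the first vector of the pair, $\mathbf{v}$ the second vector, and $\theta$ the angle between the two vectors.

(a) Here
$\mathbf{u} \cdot \mathbf{v} = (3, 4, 5) \cdot (1, 0, -1) = 3 \times 1 + 4 \times 0 + 5 \times (-1) = -2,$
$|\mathbf{u}| = \sqrt{3^2 + 4^2 + 5^2} = \sqrt{50} = 5\sqrt{2},$
$|\mathbf{v}| = \sqrt{1^2 + 0^2 + (-1)^2} = \sqrt{2}.$

Hence
$\cos\theta = \dfrac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}||\mathbf{v}|} = \dfrac{-2}{5\sqrt{2}\sqrt{2}} = -\dfrac{1}{5},$
so
$\theta = \cos^{-1}\left(-\dfrac{1}{5}\right) = 1.77$ radians (to 2 d.p.).

(b) Here
$\mathbf{u} \cdot \mathbf{v} = (2\mathbf{j} - 3\mathbf{k}) \cdot (-\mathbf{i} - \mathbf{j} - 2\mathbf{k}) = 0 \times (-1) + 2 \times (-1) + (-3) \times (-2) = -2 + 6 = 4,$
$|\mathbf{u}| = \sqrt{0^2 + 2^2 + (-3)^2} = \sqrt{4 + 9} = \sqrt{13},$
$|\mathbf{v}| = \sqrt{(-1)^2 + (-1)^2 + (-2)^2} = \sqrt{1 + 1 + 4} = \sqrt{6}.$

Hence
$\cos\theta = \dfrac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}||\mathbf{v}|} = \dfrac{4}{\sqrt{13}\sqrt{6}} = \dfrac{4}{\sqrt{78}},$
so
$\theta = \cos^{-1}\left(\dfrac{4}{\sqrt{78}}\right) = 1.10$ radians (to 2 d.p.).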
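
The angle calculations above all follow the same pattern, which can be sketched as a short function. The names `dot` and `angle_between` are hypothetical; the clamp to $[-1, 1]$ is a floating-point safeguard added here (rounding can push the computed cosine slightly outside that interval) and is not part of the mathematical formula.

```python
import math

def dot(u, v):
    """Scalar product of two vectors given as tuples of components."""
    return sum(a * b for a, b in zip(u, v))

def angle_between(u, v):
    """Angle in radians between non-zero vectors u and v."""
    cos_theta = dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))
    # Guard against rounding error before calling acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

print(round(angle_between((3, 4, 5), (1, 0, -1)), 2))
print(round(angle_between((0, 2, -3), (-1, -1, -2)), 2))
print(round(angle_between((9, -2), (1, 2)), 2))
```

The magnitude of a vector is computed here as the square root of its scalar product with itself, which keeps the sketch to a single helper.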


Solution to Exercise A54 


Points (x,y,z) that lie in the (y, z)-plane all have 
x = 0; so x = 0 is the equation of this plane. 


Similarly, points (x,y,z) that lie in the (x, z)-plane 
all have y = 0; so y = 0 is the equation of this 
plane. 


Solution to Exercise A55 


(a) This plane is parallel to the (x, y)-plane and 
passes through the point (0,0, 2). 


[Figure: the plane $z = 2$.]


(b) This plane is parallel to the (x, z)-plane and 
passes through the point (0, —1, 0). 


[Figure: the plane $y = -1$.]


Solution to Exercise A56 


We use the formula
$\mathbf{x} \cdot \mathbf{n} = \mathbf{p} \cdot \mathbf{n}$
for the equation of a plane, where $\mathbf{x} = (x, y, z)$, $\mathbf{n}$ is a normal to the plane and $\mathbf{p}$ is a point in the plane.

(a) Here $\mathbf{n} = (2, 3, 1)$ and $\mathbf{p} = (1, 0, 2)$, so the equation of the plane is
$(x, y, z) \cdot (2, 3, 1) = (1, 0, 2) \cdot (2, 3, 1).$
This can be expressed in the form
$2x + 3y + z = 1 \times 2 + 0 \times 3 + 2 \times 1,$
that is,
$2x + 3y + z = 4.$

(b) Here $\mathbf{n} = (4, -2, 1)$ and $\mathbf{p} = (-1, 1, 5)$, so the equation of the plane is
$(x, y, z) \cdot (4, -2, 1) = (-1, 1, 5) \cdot (4, -2, 1).$
This can be expressed in the form
$4x - 2y + z = (-1) \times 4 + 1 \times (-2) + 5 \times 1,$
that is,
$4x - 2y + z = -1.$
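
The formula $\mathbf{x} \cdot \mathbf{n} = \mathbf{p} \cdot \mathbf{n}$ amounts to reading the coefficients of $x$, $y$ and $z$ from the normal and computing the right-hand side as a scalar product. A minimal sketch follows; `plane_equation` is a hypothetical name, and the returned tuple $(a, b, c, d)$ stands for the equation $ax + by + cz = d$.

```python
def plane_equation(normal, point):
    """Return (a, b, c, d) such that the plane is a*x + b*y + c*z = d.

    normal is a normal vector (a, b, c) to the plane and point is any
    point in the plane; d is the scalar product point . normal.
    """
    a, b, c = normal
    d = sum(n * p for n, p in zip(normal, point))
    return a, b, c, d

print(plane_equation((2, 3, 1), (1, 0, 2)))     # part (a)
print(plane_equation((4, -2, 1), (-1, 1, 5)))   # part (b)
```

Any non-zero multiple of the normal gives an equivalent equation, with all four coefficients scaled by the same factor.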




Unit A2 
Number systems 


Introduction 


In this unit you will look at some different systems of numbers, and the 
rules for combining numbers in these systems. You have met many of these 
systems before, and you will study some of them in more detail later in the 
module. 


For each number system, you will consider which numbers have additive 
and/or multiplicative inverses in the system. You will also look at when 
and how we can solve certain types of equations in the system, such as 
linear, quadratic and other polynomial equations. The answers to these 
questions provide insights into the structure of the various number 
systems, and this in turn enables us to define abstract structures like fields 
and groups which share some or all of the properties of number systems 
and arise in many areas of mathematics. You will meet fields in this unit, 
and study groups in Books B and E of this module. 


1 Real numbers 


In this section you will revise real numbers, and some important subsets of 
the real numbers. You will meet a collection of rules that the arithmetic of 
real numbers satisfies, and see that some subsets of the real numbers also 
satisfy these rules, whereas others do not. Finally, you will look at 
polynomial equations with real coefficients and consider the number of 
solutions they have. 


1.1 Standard subsets of the real numbers 


The set of all real numbers is denoted by R. This set can be pictured as 
a number line, often called the real line. Each real number is represented 
by a point on the real line, and each point on this line represents a real 
number. Thus R is the set of all numbers that represent lengths along a 
line (and the negatives of such numbers). For example, the number $1\tfrac{1}{4}$ corresponds to the point that lies a distance $1\tfrac{1}{4}$ from 0 in the positive direction, as shown in Figure 1.

We sometimes refer to real numbers simply as reals.

Figure 1 The real line showing the number $1\tfrac{1}{4}$


Figure 2 The diagonal of a square of side 1, with $x^2 = 1^2 + 1^2 = 2$

The following standard subsets of the set R are used frequently in this 
module. You met some of them briefly in the previous unit. 


The set R* is the set of all non-zero real numbers. We can describe this set 
using set notation in various ways: 


$\mathbb{R}^* = \mathbb{R} - \{0\},$
$\mathbb{R}^* = (-\infty, 0) \cup (0, \infty),$
$\mathbb{R}^* = \{x \in \mathbb{R} : x \neq 0\}.$


The set Z is the set of integers: 
$\mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}.$

The set N is the set of positive integers, known as the natural numbers:
$\mathbb{N} = \{n \in \mathbb{Z} : n > 0\} = \{1, 2, 3, \ldots\}.$


The set Q is the set of rational numbers. A rational number is a real 
number that can be expressed as a fraction whose numerator and 
denominator are integers. So we can describe Q using set notation as 
follows: 


$\mathbb{Q} = \{p/q : p \in \mathbb{Z},\ q \in \mathbb{N}\}.$
Notice that the sets Q, Z and N are related as follows:
$\mathbb{N} \subseteq \mathbb{Z} \subseteq \mathbb{Q}.$


That is, N is a subset of Z, which in turn is a subset of Q. 


Rational numbers and irrational numbers 


You have seen that R is the set of real numbers, and Q is the set of 
rational numbers, that is, the set of all numbers that can be expressed as 
fractions with integer numerators and denominators. The set Q is certainly 
a subset of the set R, because each rational number represents a length 
along the real line (or the negative of such a length), in the way indicated 
at the beginning of this subsection. But it is not obvious at first sight 
whether Q is a proper subset of R, or whether Q and R are in fact the 
same set. If it were possible to express every number that represents a 
length along the real line as a fraction with an integer numerator and 
denominator, then Q and R would be the same set. 


In fact, as you will know, they are not the same set: some numbers that 
represent lengths cannot be expressed as fractions with integer numerators 
and denominators. This is a fact that was discovered by, and was 
surprising to, the ancient Greeks. 


For example, consider the length of the diagonal of a square of side 1, as shown in Figure 2. If this length is $x$ then, by Pythagoras’ Theorem, $x$ must satisfy the equation $x^2 = 2$. However, there is no rational number that satisfies this equation.

To see this, suppose that there is such a number, say $x = p/q$, where $p$ and $q$ are positive integers and $p/q$ is a fraction in lowest terms (so there is no integer greater than 1 that is a factor of both the numerator and denominator). Since $x = p/q$ satisfies the equation $x^2 = 2$, we know that
$\left(\dfrac{p}{q}\right)^2 = 2,$
which gives $p^2 = 2q^2$, so $p^2$ is an even number. This tells us that $p$ must also be an even number (because if $p$ were odd, then $p^2$ would also be odd).

Now, since $p$ is even, we can write $p = 2r$, where $r$ is a positive integer, so $x = 2r/q$. Since $x$ satisfies the equation $x^2 = 2$, we have
$\left(\dfrac{2r}{q}\right)^2 = \dfrac{4r^2}{q^2} = 2,$
which gives $q^2 = 2r^2$, so $q^2$ is an even number. In the same way as for $p$, this means that $q$ must also be an even number.

But this is impossible: $p$ and $q$ cannot both be even, because then 2 would be a factor of both numerator and denominator and $p/q$ is defined as being a fraction in lowest terms. It follows that there is no such rational number $p/q$. That is, there is no positive rational solution of the equation $x^2 = 2$, and since the negative solution of the equation is obtained simply by changing the sign of the positive solution, we have proved the following theorem.


Theorem A1

There is no rational number $x$ such that $x^2 = 2$.


The proof that you have just seen is a classic example of a proof by contradiction. You will learn more about the technique of proof by contradiction, and other useful methods of proof, in the next unit, Unit A3, Mathematical language and proof.


So the set Q is definitely a proper subset of R; that is, R contains numbers that are not in Q. For example, R contains the number $\sqrt{2}$, which is the positive solution of the equation $x^2 = 2$; thus $(\sqrt{2})^2 = 2$. The set R also contains many other numbers that are not rational numbers, such as $\sqrt{3}$, $\sqrt{7}$ and $\sqrt[3]{2}$ (where $(\sqrt[3]{2})^3 = 2$), and so on. Indeed, it can be shown that, if $m$ and $n$ are natural numbers, and the equation $x^m = n$ has no integer solution, then the positive solution of this equation, written as $\sqrt[m]{n}$, cannot be rational.

Other real numbers that are not rational include the number $\pi$, which denotes the ratio of the circumference of a circle to its diameter, and the number $e$, the base for natural logarithms.


The real numbers that are not rational numbers are known as irrational 
numbers. 




In 1767, in a paper read before the Berlin Academy of Sciences, the Swiss mathematician Johann Heinrich Lambert (1728-1777) provided the first proof that $\pi$ is irrational. Lambert was a close friend of Leonhard Euler (1707-1783), who had invited him to Berlin in 1764, and of Joseph-Louis Lagrange (1736-1813) who was Euler’s successor at the Berlin Academy after Euler returned to St Petersburg in 1766. In addition to this result on $\pi$, Lambert is well known for his work in geometry.


We often refer to rational and irrational numbers simply as rationals and 
irrationals, respectively. 


Decimal expansions of rational numbers and irrational numbers

Every real number has a decimal expansion; for example,

$\tfrac{1}{11} = 0.09090909\ldots,$
$1\tfrac{1}{4} = 1.25,$
$\pi = 3.141592653589\ldots.$


The decimal expansion of a rational number is always either a 
terminating (that is, finite) decimal, such as 1.25, or a recurring 
decimal, such as 0.09090909..., in which the digits repeat in a regular 
pattern from some position onwards. The decimal representation of any 
rational number p/q can be obtained by using long division to divide q 
into p. 
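
The long-division process can be sketched in code. The function below is an illustration only: its name and return format are assumptions, and it expects a non-negative integer `p` and a positive integer `q`. It records each remainder as it arises; since only finitely many remainders are possible, either the remainder becomes 0 (a terminating decimal) or some remainder repeats (a recurring decimal).

```python
def decimal_expansion(p, q):
    """Long division of p by q, for integers p >= 0 and q > 0.

    Returns (integer_part, non_recurring_digits, recurring_digits),
    where the two digit fields are strings; recurring_digits is empty
    for a terminating decimal.
    """
    integer_part, remainder = divmod(p, q)
    digits = []
    seen = {}  # remainder -> position in digits where it first arose
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, q)
        digits.append(str(digit))
    if remainder == 0:
        return integer_part, "".join(digits), ""
    start = seen[remainder]
    return integer_part, "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(1, 11))  # recurring block "09"
print(decimal_expansion(5, 4))   # terminating: 1.25
print(decimal_expansion(1, 6))   # 0.1 followed by recurring "6"
```

The loop runs at most `q` times, because there are at most `q` distinct remainders, which is why the expansion of a rational number must terminate or recur.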


On the other hand, the decimal expansion of an irrational number, such as $\pi$, is neither finite nor recurring. Instead, it continues for ever, with no pattern of digits that repeats indefinitely.


Every possible decimal number, finite or infinite, recurring or 
non-recurring, represents a real number. 


1.2 Arithmetic of real numbers 


Throughout your previous mathematical studies you will have used various 
rules of arithmetic whenever you carried out a calculation or an algebraic 
manipulation. For example, you will be familiar with the rule that the 
order in which you add or multiply two numbers does not affect the result, 
and with the rules for multiplying out brackets. Many of these rules of 
arithmetic come from the eleven simple properties of addition and 
multiplication of real numbers given in the box below. 




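Properties for addition

A1 Closure For all $a, b \in \mathbb{R}$,
$a + b \in \mathbb{R}$.

A2 Associativity For all $a, b, c \in \mathbb{R}$,
$a + (b + c) = (a + b) + c$.

A3 Additive identity For all $a \in \mathbb{R}$,
$a + 0 = a = 0 + a$.

A4 Additive inverses For each $a \in \mathbb{R}$, there is a number $-a \in \mathbb{R}$ such that
$a + (-a) = 0 = (-a) + a$.

A5 Commutativity For all $a, b \in \mathbb{R}$,
$a + b = b + a$.

Properties for multiplication

M1 Closure For all $a, b \in \mathbb{R}$,
$a \times b \in \mathbb{R}$.

M2 Associativity For all $a, b, c \in \mathbb{R}$,
$a \times (b \times c) = (a \times b) \times c$.

M3 Multiplicative identity For all $a \in \mathbb{R}$,
$a \times 1 = a = 1 \times a$.

M4 Multiplicative inverses For each $a \in \mathbb{R}^*$, there is a number $a^{-1} \in \mathbb{R}$ such that
$a \times a^{-1} = 1 = a^{-1} \times a$.

M5 Commutativity For all $a, b \in \mathbb{R}$,
$a \times b = b \times a$.

Property combining addition and multiplication

D1 Distributivity For all $a, b, c \in \mathbb{R}$,
$a \times (b + c) = (a \times b) + (a \times c)$.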




For clarity, the multiplication properties (M1 to M5) are shown in the 
above box using the symbol x but, as you will know, we often prefer to 
write simply ab for ‘a multiplied by b’, rather than a x b. 


The closure properties (A1 and M1) simply say that adding or multiplying two real numbers results in another real number.


The numbers 0 and 1 are known as the additive identity and multiplicative identity of R, respectively. The number $-a$ in property A4 is known as the additive inverse or negative of $a$. The number $a^{-1}$ in property M4 is known as the multiplicative inverse or reciprocal of $a$. One number, namely 0, does not have a multiplicative inverse, since there is no number that multiplies with 0 to make 1, and so 0 is excluded in the multiplicative inverses property (M4).


The set of rational numbers, Q, also satisfies the eleven properties in the 
box above, in the sense that if R is replaced by Q throughout the box, 
then the properties are still true. (Of course, in property M4 the number 0 
is excluded, just as for the real numbers.) You will see later that the same 
properties hold for the set of complex numbers, C. However, if we restrict 
ourselves to the set of integers, Z, then one of these properties is no longer 
true, as you are asked to show in the next exercise. 


Exercise A57 


(a) Show that Z does not satisfy the multiplicative inverses 
property (M4) by giving an example of an integer that does not have 
a multiplicative inverse. 


(b) Which integers have a multiplicative inverse in Z? 


A set of numbers, with addition and multiplication defined in such a way 
that they satisfy the eleven properties in the box, together with a twelfth, 
rather trivial, property, namely that the additive and multiplicative 
identities are different numbers, is known as a field. (The twelfth property 
is included for technical reasons to ensure that the set {0} with addition 
and multiplication is not a field; it need not concern you in this module.) 


Thus a field is a number system that shares many of the properties of the 
arithmetic of the real numbers. You have seen that R and Q are fields, but 
that Z is not a field. 
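
The properties in the box can be spot-checked numerically. The following Python sketch is an illustration only and is not part of the module text. It tests properties A2 to A5, M2 to M5 and D1 on a small, arbitrarily chosen sample of rational numbers, using the standard `fractions.Fraction` type for exact arithmetic. The names `samples` and `check_field_properties` are assumptions made for this example, and a finite check of this kind is evidence, not a proof.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample of rationals; a finite spot-check, not a proof.
samples = [Fraction(0), Fraction(1), Fraction(-3, 4), Fraction(5, 2)]


def check_field_properties(values):
    """Return True if the sampled values satisfy A2-A5, M2-M5 and D1."""
    zero, one = Fraction(0), Fraction(1)
    for a, b, c in product(values, repeat=3):
        if a + (b + c) != (a + b) + c:        # A2 associativity of +
            return False
        if a * (b * c) != (a * b) * c:        # M2 associativity of x
            return False
        if a * (b + c) != (a * b) + (a * c):  # D1 distributivity
            return False
    for a, b in product(values, repeat=2):
        if a + b != b + a or a * b != b * a:  # A5 and M5 commutativity
            return False
    for a in values:
        if a + zero != a or a * one != a:     # A3 and M3 identities
            return False
        if a + (-a) != zero:                  # A4 additive inverse
            return False
        if a != zero and a * (1 / a) != one:  # M4, with 0 excluded
            return False
    return True


print(check_field_properties(samples))  # True
```

The closure properties A1 and M1 are reflected in the fact that every sum and product of `Fraction` values is again a `Fraction`.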


1.3 Solutions of polynomial equations 


Even though the sets R, Q and C, with addition and multiplication, are all 
fields and hence share similar rules of arithmetic, they are quite different in 
other ways. 


Some of their differences are highlighted by considering which polynomial 
equations, with coefficients in the set in question, have a solution in that 
set. Here is a reminder of what we mean by a polynomial equation and its 
coefficients. 


Definitions 

A polynomial in x of degree n is an expression of the form
$$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,$$
where $a_0, a_1, \ldots, a_n$ are numbers, called the coefficients of the
polynomial, with $a_n \neq 0$.


A polynomial equation in x of degree n is an equation of the form 
p(x) = 0, where p(x) is a polynomial in x of degree n. 


Polynomial equations (and polynomials) of degrees 1, 2 and 3 are 
called linear, quadratic and cubic, respectively. 


So, for example, the following are polynomials: 
iz’ — z? + V3, Qn—7, r?’ —2, 
and the following are polynomial equations: 
to — x? + V3 =0, 2¢—T=0, r? =2. 
(The third equation here is a rearrangement of $x^2 - 2 = 0$.)


The equation $x^2 = 2$ is a polynomial equation with coefficients in Q, and
you saw earlier in this section that this equation has no solution in Q.
However, the equation $x^2 = 2$ can also be considered as an equation with
coefficients in R, and it does have solutions in R, namely the two solutions
$\pm\sqrt{2}$. In this sense, R seems a ‘better’ number system than Q.


In the next exercise, you are asked to look at some linear equations, and 
consider whether they have solutions in the sets Q and R. 


Exercise A58 


(a) The following linear equations have coefficients in Q. Determine 
whether each of them has a solution in Q. 


(i) Br +10=0 (ii) 5a +1=0 


(b) The following linear equations have coefficients in R. Determine 
whether each of them has a solution in R. 


(i) 22-6=0 (ii) V8x+7=0 




In fact, every linear equation with coefficients in Q has a solution in Q,
because the equation $ax + b = 0$ where $a, b \in \mathbb{Q}$ and $a \neq 0$ has exactly one
solution, namely $x = -b/a$, which is rational. (Here we have used
properties A2-A4 and M1-M5 to deduce that $x = -ba^{-1} \in \mathbb{Q}$, although we
usually express this using ‘division’ as $x = -b/a \in \mathbb{Q}$.) Similarly, every
linear equation with coefficients in R has exactly one solution in R.


Let us now look at quadratic equations. The example of the quadratic 
equation x? = 2 has already shown you that not every quadratic equation 
with coefficients in Q has a solution in Q. In the next exercise, you are 
asked to look at some quadratic equations with coefficients in R, and 
consider whether they have solutions in R. 


Remember that it is usually best to solve a quadratic equation by 
factorisation if you can. Otherwise, you can use the quadratic formula, 
which tells you that the solutions of the quadratic equation
$ax^2 + bx + c = 0$, where $a, b, c \in \mathbb{R}$ and $a \neq 0$, are given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$


Exercise A59 


Solve the following quadratic equations, stating how many solutions each 
equation has in R. 


(a) $x^2 - 7x + 12 = 0$ (b) $x^2 + 6x + 9 = 0$ (c) $2x^2 + 5x - 3 = 0$
(d) $2x^2 - 2x - 1 = 0$ (e) $x^2 - 2x + 5 = 0$ (f) $x^2 - 2\sqrt{3}x + 3 = 0$


Exercise A59 illustrates that some quadratic equations with coefficients
in R have two solutions in R, some have only one and some have none. In
either of the first two cases, the solutions may be rational or irrational. 
Although you may be accustomed to equations with integer coefficients 
such as those in Exercise A59(a)-(e), these facts still apply if some or all of 
the coefficients are irrational; that is, if the coefficients are any real 
numbers. 
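
The three cases correspond to the sign of the discriminant $b^2 - 4ac$ in the quadratic formula. As an illustration only, here is a minimal Python sketch that returns the real solutions of a quadratic equation; the function name `real_solutions` and the sample equations are assumptions made for this example and are not taken from the exercises. Because it uses floating-point arithmetic, the test for a zero discriminant is reliable only when the coefficients and discriminant are represented exactly.

```python
import math


def real_solutions(a, b, c):
    """Real solutions of a*x**2 + b*x + c = 0 (a non-zero), as a sorted list."""
    if a == 0:
        raise ValueError("not a quadratic: a must be non-zero")
    disc = b * b - 4 * a * c  # the discriminant
    if disc < 0:
        return []                       # no real solutions
    if disc == 0:
        return [-b / (2 * a)]           # exactly one real solution
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])


print(real_solutions(1, -5, 6))  # [2.0, 3.0]
print(real_solutions(1, -4, 4))  # [2.0]
print(real_solutions(1, 0, 1))   # []
```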


So, although the set R seems ‘better’ than the set Q, working with R still 
does not enable us to find solutions of all quadratic equations. In Section 2 
you will see that working with the set of complex numbers, C, does enable 
us to find solutions of all quadratic equations, and in fact it enables us to 
find solutions of all polynomial equations. 


1.4 The Factor Theorem 


In the previous subsection we looked at the issue of whether polynomial 
equations with coefficients in Q or in R have solutions in Q or in R, 
respectively. We now confine our attention to polynomial equations with 
coefficients in R, and consider the maximum number of solutions that such 
an equation of degree n can have. For example, you already know that a 
linear equation (that is, a polynomial equation of degree 1) has exactly one 
solution, and a quadratic equation (that is, a polynomial equation of 
degree 2) has a maximum of two solutions. We also look at ways in which 
we can sometimes find some or all of the solutions of a polynomial 
equation. 


We will mainly discuss these issues in terms of polynomials, rather than 
polynomial equations. We make the following definition. 


Definition 


The roots (or zeros) of a polynomial p(x) are the solutions of the 
equation p(x) = 0. 


So finding the roots of a polynomial p(x) means the same as finding the 
solutions of the polynomial equation p(x) = 0. 


A polynomial with coefficients in R is called a real polynomial. 


You know that you can often find the roots of a quadratic polynomial by 
factorising it. Factorisation can also be useful for higher-degree 
polynomials. In general, if a polynomial p(x) can be expressed in the form 


p(x) = s(x)t(x),


where s(x) and t(x) are polynomials whose degree is less than that of p(x), 
then we say that s(x) and t(x) are factors of p(x). 


The following theorem can help us to factorise polynomials. You will see a 
proof of this theorem in Unit A3. 


Theorem A2 Factor Theorem (in R) 


Let p(x) be a real polynomial, and let $a \in \mathbb{R}$. Then p(a) = 0 if and
only if $x - a$ is a factor of p(x).


The phrase ‘if and only if’ is a means of stating two converse 
mathematical statements at once; here it tells us that the following two 
statements are both true: 


• If p(a) = 0, then $x - a$ is a factor of p(x).
• If $x - a$ is a factor of p(x), then p(a) = 0.


You will revise the use of the phrase ‘if and only if’ in more detail in 
Unit A3. 




The following worked exercise demonstrates how you can use the Factor 
Theorem. It also demonstrates how, once you know that a particular 
polynomial p(x) has a factorisation of the form p(x) = (x — a)q(x), where 
you know the value of the root a, you can find the polynomial q(x) by 
equating corresponding coefficients, also known as comparing 
coefficients. 


Worked Exercise A24 


Show that x — 2 is a factor of the cubic polynomial 
$$p(x) = x^3 + x^2 - x - 10,$$


and find the corresponding factorisation of p(x). 


Solution 
(Evaluate p(2) and apply the Factor Theorem.)
We have
$$p(2) = 2^3 + 2^2 - 2 - 10 = 8 + 4 - 2 - 10 = 0.$$
So, by the Factor Theorem (Theorem A2), p(x) has the factor $x - 2$.


(So, since p(x) is a cubic polynomial, it must be the product of
$x - 2$ and a quadratic polynomial.)


Hence
$$x^3 + x^2 - x - 10 = (x - 2)(ax^2 + bx + c)$$
for some real numbers a, b and c.


(To find the coefficients a, b and c of the quadratic polynomial,
compare coefficients on each side of the equation. Start with the
coefficients of the highest-degree terms and the constant terms.)


Equating the coefficients of $x^3$ gives $1 = a$. Equating the constant
terms gives $-10 = -2c$, so $c = 5$. Thus we have
$$x^3 + x^2 - x - 10 = (x - 2)(x^2 + bx + 5).$$
(We can compare the coefficients of $x^2$ or $x$; we choose $x^2$.)
Equating the coefficients of $x^2$ gives $1 = -2 + b$, so $b = 3$. Hence
$$x^3 + x^2 - x - 10 = (x - 2)(x^2 + 3x + 5).$$


(We can equate the coefficients of $x$ to check our answer. This gives
$-1 = 5 - 2b$, so again $b = 3$, as expected.)
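
The factorisation can also be checked mechanically. The sketch below is an illustration only: it uses synthetic division, which is a different technique from the comparison of coefficients used in the solution above, to divide a polynomial by $x - a$. The function name `divide_by_linear` is an assumption made for this example. Coefficients are listed from the highest power down, and the remainder returned is equal to p(a).

```python
def divide_by_linear(coeffs, root):
    """Divide a polynomial by (x - root) using synthetic division.

    coeffs lists the coefficients from the highest power down.
    Returns (quotient_coeffs, remainder); the remainder equals p(root).
    """
    acc = [coeffs[0]]
    for c in coeffs[1:]:
        acc.append(c + root * acc[-1])
    return acc[:-1], acc[-1]


# p(x) = x^3 + x^2 - x - 10 divided by x - 2
quotient, remainder = divide_by_linear([1, 1, -1, -10], 2)
print(quotient, remainder)  # [1, 3, 5] 0
```

The zero remainder confirms that $x - 2$ is a factor, and the quotient coefficients 1, 3, 5 agree with the quadratic factor $x^2 + 3x + 5$ found above.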


Exercise A60 


(a) For what value of k is x + 3 a factor of 
$p(x) = x^3 + kx^2 + 6x + 36$?
(b) For this value of k, find the corresponding factorisation of p(x). 


The following theorem can be proved by repeatedly applying the Factor 
Theorem, as you will see in Unit A3. 


Theorem A3 
Let $p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ be a real polynomial,
and suppose that p(x) has n distinct real roots $\alpha_1, \alpha_2, \ldots, \alpha_n$. Then

$$p(x) = a_n(x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n).$$


For example, the polynomial 
$$p(x) = 2x^4 - 8x^3 - 2x^2 + 32x - 24$$


has the four distinct real roots 1, 2, 3 and $-2$, as you can check by
evaluating p(1), p(2), p(3) and p($-2$), and so (from Theorem A3)


$$p(x) = 2(x - 1)(x - 2)(x - 3)(x + 2).$$
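
The check suggested above can be carried out by hand or, as in this illustrative Python sketch, by evaluating the polynomial at each claimed root. The helper name `evaluate` is an assumption made for this example; it uses Horner's rule, which is simply a convenient way of organising the arithmetic.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients from highest power down) at x."""
    value = 0
    for c in coeffs:
        value = value * x + c  # Horner's rule
    return value


p = [2, -8, -2, 32, -24]  # 2x^4 - 8x^3 - 2x^2 + 32x - 24
print([evaluate(p, r) for r in (1, 2, 3, -2)])  # [0, 0, 0, 0]
```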


In fact, as you will see in Subsection 2.4, every real polynomial p(x) of
degree n has a factorisation of the form given in Theorem A3, although the
roots $\alpha_1, \alpha_2, \ldots, \alpha_n$ need not be distinct and may include non-real complex
numbers. We have the following.


A real polynomial of degree n has at most n distinct roots (some of 
which may be complex numbers). 


We now look at ways in which you can sometimes find some or all of the 
roots of a real polynomial. 


The following useful observation should be familiar from your previous 
studies of factorising quadratics. If you multiply out the brackets 


$$(x - \alpha)(x - \beta),$$


where $\alpha$ and $\beta$ are real numbers, then you obtain a quadratic polynomial
$p(x) = x^2 + bx + c$ such that


• the value of c, the constant term, is $\alpha\beta$;


• the value of b, the coefficient of x, is $-(\alpha + \beta)$.




We can make a similar observation about the result of multiplying out the 
n brackets 
$$(x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n).$$


When we multiply out these brackets, we obtain a polynomial of degree n
such that the coefficient of $x^n$ is 1. This polynomial has the following
properties.


Suppose that
$$x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n),$$


where $\alpha_1, \alpha_2, \ldots, \alpha_n$ are real numbers. Then
• $a_0 = (-1)^n \alpha_1 \alpha_2 \cdots \alpha_n$;


• $a_{n-1} = -(\alpha_1 + \alpha_2 + \cdots + \alpha_n)$.


For example, 


$$x^3 + x^2 - 5x + 3 = (x - 1)(x - 1)(x + 3),$$


so $a_0 = 3$, $\alpha_1 = 1$, $\alpha_2 = 1$, $\alpha_3 = -3$ and $a_{n-1} = 1$ and we have
$$3 = (-1)^3 \times 1 \times 1 \times (-3)$$

and
$$1 = -(1 + 1 - 3).$$


The expression for the constant term $a_0$ in the box above is obtained by
comparing constant terms on each side of the equation at the top of the
box.


Similarly, the expression for the coefficient $a_{n-1}$ is obtained by equating
the corresponding coefficients of $x^{n-1}$, as follows. When the brackets on
the right-hand side are multiplied out, each term in $x^{n-1}$ arises by
choosing the variable $x$ from $n - 1$ of the brackets, and the constant term
from the remaining bracket. Choosing the constant term from the first
bracket gives $-\alpha_1 x^{n-1}$, choosing the constant term from the second
bracket gives $-\alpha_2 x^{n-1}$, and so on. Adding all these terms and comparing
the resulting total coefficient with the coefficient of $x^{n-1}$ on the left-hand
side gives $a_{n-1} = -(\alpha_1 + \alpha_2 + \cdots + \alpha_n)$.
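
These two relationships can be confirmed for any particular list of roots by multiplying out the brackets. The Python sketch below is an illustration only: it builds the coefficients of $(x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$ one bracket at a time and then compares $a_0$ and $a_{n-1}$ with the product and sum of the roots. The function name `monic_from_roots` is an assumption made for this example.

```python
from math import prod


def monic_from_roots(roots):
    """Coefficients (highest power first) of the product of (x - r)."""
    coeffs = [1]
    for r in roots:
        times_x = coeffs + [0]                         # current polynomial times x
        times_minus_r = [0] + [-r * c for c in coeffs]  # current polynomial times -r
        coeffs = [s + t for s, t in zip(times_x, times_minus_r)]
    return coeffs


roots = [1, 1, -3]
coeffs = monic_from_roots(roots)
print(coeffs)  # [1, 1, -5, 3]

n = len(roots)
print(coeffs[-1] == (-1) ** n * prod(roots))  # True: constant term
print(coeffs[1] == -sum(roots))               # True: coefficient of x^(n-1)
```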


The observations in the box above can help us factorise a polynomial if we 
know that all of its roots are integers. 


Worked Exercise A25 


Given that all the roots of the polynomial 
$$p(x) = x^3 - 6x^2 - 9x + 14$$


are integers, write p(x) as a product of linear factors. 
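
One way to organise the search in a problem of this kind follows from the property $a_0 = (-1)^n \alpha_1 \alpha_2 \cdots \alpha_n$: if the coefficient of $x^n$ is 1 and all the roots are integers, then each root must divide the constant term. The Python sketch below is an illustration of that idea, not the module's own solution; the function name `integer_roots` is an assumption made for this example, and a repeated root is reported only once.

```python
def integer_roots(coeffs):
    """Integer roots of a polynomial with integer coefficients.

    coeffs lists the coefficients from the highest power down, and the
    constant term is assumed to be non-zero. Candidates are the positive
    and negative divisors of the constant term.
    """
    def evaluate(x):
        value = 0
        for c in coeffs:
            value = value * x + c
        return value

    constant = abs(coeffs[-1])
    divisors = [d for d in range(1, constant + 1) if constant % d == 0]
    return sorted(r for d in divisors for r in (d, -d) if evaluate(r) == 0)


print(integer_roots([1, -6, -9, 14]))  # [-2, 1, 7]
```

The three roots found are consistent with the box above: their sum is 6, which is minus the coefficient of $x^2$, and $(-1)^3 \times (-2) \times 1 \times 7 = 14$ is the constant term.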


Exercise A61 


(a) Given that all the roots of the polynomial 
$$p(x) = x^3 - 9x^2 + 23x - 15$$
are integers, write p(x) as a product of linear factors. 
(b) Given that all the solutions are integers, solve the equation
$$x^3 - 3x^2 + 4 = 0.$$


Use the property relating the sum of the roots to the coefficient of $x^2$
to write the equation as a product of linear factors.


Exercise A62 


(a) Determine a polynomial equation whose solutions are $1, 2, 3, -3$.


(b) Determine a cubic equation whose only solutions are 2 and 3. 




2 Complex numbers 


In this section you will revise complex numbers and their properties. You 
will see how to find complex roots of certain polynomial equations, and 
how the complex exponential function can be used to represent complex 
numbers. 


2.1 What is a complex number? 


Earlier you saw that the real numbers correspond to points on the real 
line. In this subsection you will see that the complex numbers correspond 
to points in the plane. 


Complex numbers arise naturally as solutions of quadratic equations. You 
have seen that some quadratic equations have no solutions in R, that is, no 
real solutions. For example, you saw in Exercise A59(e) that the equation 
$x^2 - 2x + 5 = 0$ has no real solutions, because there is no real number
whose square is $-16$. We can extend the set of real numbers to ensure that
every quadratic equation has at least one solution.


To do this, we introduce a new number, denoted by $i$, which is defined to
have the property that $i^2 = -1$. We assume that $i$ combines with itself,
and with real numbers, according to the usual rules of arithmetic. In
particular, we assume that if we multiply $i$ by any real number $y$ then we
obtain the product $iy = yi$, and if we then add this product to any real
number $x$ we obtain the sum $x + iy = x + yi$. Sums of this form are known
as complex numbers, and they are the numbers we need to enable us to 
find solutions of every quadratic equation. 


Definitions 


A complex number is an expression of the form $x + iy$, where $x$ and
$y$ are real numbers and $i^2 = -1$. The set of all complex numbers is
denoted by C.


A complex number $z = x + iy$ has real part $x$ and imaginary
part $y$; we write


$$\operatorname{Re} z = x \quad\text{and}\quad \operatorname{Im} z = y.$$


Two complex numbers are equal when their real parts and their 
imaginary parts are equal. 


Remarks 


1. Any real number $x$ can be written in the form $x + i0$, and any complex
number of the form $x + i0$ is usually written simply as $x$. In this sense,
R is a subset of C. The complex number $0 + i0$ is written as 0.


2. We follow the usual practice of writing a general complex number as
$x + iy$, but a particular complex number as, for example, $2 + 3i$, rather
than $2 + i3$.


We also write $2 - 3i$ rather than $2 + (-3)i$, and we write $2 + i\sqrt{3}$ rather
than $2 + \sqrt{3}i$, to avoid confusion with $2 + \sqrt{3i}$ (where the number $i$ is
included under the square root).


3. Note that $\operatorname{Re} z$ and $\operatorname{Im} z$ are both real numbers. For example, if
$z = 2 - 3i$, then $\operatorname{Re} z = 2$ and $\operatorname{Im} z = -3$.


4. A complex number of the form $0 + iy$ (where $y \neq 0$) is sometimes called
an imaginary number. 


You know that every positive real number has two square roots. When you 
are working with the complex numbers, every negative real number also 
has two square roots, as follows. 


Square roots of a negative real number 


For a positive real number $d$, the square roots of $-d$ are $\pm i\sqrt{d}$.


You can check that $\pm i\sqrt{d}$ are square roots of $-d$ by using the usual rules
of arithmetic:


$$(\pm i\sqrt{d})^2 = i^2(\sqrt{d})^2 = (-1) \times d = -d.$$


You will see in Subsection 2.4 why these are the only square roots of —d. 


We can solve quadratic equations that have no real solutions by using the 
fact in the box above, together with the quadratic formula. When we 
apply this formula to a quadratic equation that has no real solutions, we 
obtain a term in the numerator of the form $\pm\sqrt{-d}$, where $d$ is a positive
number. In real terms, this is meaningless, because the square root sign 
applies only to positive real numbers, or zero. However, when we are 
working with complex numbers, we can take this term to mean the two 
square roots of —d, which are as given in the box above. This is illustrated 
in the worked exercise below. 


The equation in this worked exercise is the one from Exercise A59(e), 
rewritten using z as the variable name. We often use the letter z for a 
complex variable (a variable that represents a complex number). 




Worked Exercise A26 


Solve the quadratic equation 


$$z^2 - 2z + 5 = 0.$$


Solution 
The quadratic formula gives 


$$z = \frac{2 \pm \sqrt{4 - 20}}{2} = \frac{2 \pm \sqrt{-16}}{2} = \frac{2 \pm 4i}{2} = 1 \pm 2i.$$


We can check that the two complex numbers found in Worked 
Exercise A26 satisfy the equation we were trying to solve. We use the 
usual rules of arithmetic, and substitute —1 for i? wherever it appears. 


For example, if $z = 1 + 2i$, then
$$\begin{aligned}
z^2 - 2z + 5 &= (1 + 2i)^2 - 2(1 + 2i) + 5 \\
&= 1 + 4i + 4i^2 - 2 - 4i + 5 \\
&= 1 + 4i + 4(-1) - 2 - 4i + 5 \\
&= 1 + 4i - 4 - 2 - 4i + 5 \\
&= 0.
\end{aligned}$$
The solution $z = 1 - 2i$ can be checked in the same way.
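
The same calculation can be reproduced with Python's built-in complex numbers, in which $i$ is written `1j`. This is a minimal sketch for illustration; the function name `quadratic_roots` is an assumption made for this example, and `cmath.sqrt` returns one of the two square roots of a negative number, so both signs are taken explicitly.

```python
import cmath


def quadratic_roots(a, b, c):
    """Both solutions in C of a*z**2 + b*z + c = 0, where a is non-zero."""
    root = cmath.sqrt(b * b - 4 * a * c)  # one square root of the discriminant
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


z1, z2 = quadratic_roots(1, -2, 5)
print(z1, z2)  # (1+2j) (1-2j)

# Substitute each solution back into z^2 - 2z + 5.
print(z1 * z1 - 2 * z1 + 5 == 0, z2 * z2 - 2 * z2 + 5 == 0)  # True True
```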


Similarly, it can be checked that the method of Worked Exercise A26 will
in general give us two complex numbers that satisfy the quadratic equation
we are trying to solve. So the use of the number $i$ enables us to find
solutions of any quadratic equation. You will see later in this section that
the use of $i$ ensures that all polynomial equations have solutions, even
those whose coefficients are themselves complex numbers. This, in turn,
means that any polynomial can be factorised into a product of linear
factors; for example,


$$\begin{aligned}
z^2 - 2z + 5 &= (z - (1 + 2i))(z - (1 - 2i)) \\
&= (z - 1 - 2i)(z - 1 + 2i).
\end{aligned}$$


Exercise A63 


Solve the following equations, giving all solutions in C. 
(a) $z^2 - 4z + 7 = 0$

(b) $z^2 - iz + 2 = 0$

(c) $z^3 - 3z^2 + 4z - 2 = 0$ (Hint: $z = 1$ is one solution.)
(d) $z^4 - 16 = 0$




The complex plane 


Just as there is a one-to-one correspondence between the real numbers and 
the points on the real line, so there is a one-to-one correspondence between 
the complex numbers and the points in the plane. This correspondence is 
given by 


$$\begin{aligned}
f : \mathbb{C} &\longrightarrow \mathbb{R}^2 \\
x + iy &\longmapsto (x, y).
\end{aligned}$$


Thus we can represent points in the plane by complex numbers and, 
conversely, we can represent complex numbers by points in the plane. 
When we do this, we refer to the plane as the complex plane, and we 
often refer to the complex numbers as points in the complex plane. A 
diagram, such as Figure 3, showing complex numbers represented as points 
in the plane in this way is sometimes called an Argand diagram. 


The first widely recognised publication of this idea appeared in a 
manuscript Essai sur une manière de représenter les quantités 
imaginaires dans les constructions géométriques dated 1806 by a 
mathematician described only as Monsieur Argand. Until recently he 
was believed to be Jean-Robert Argand but further research by 

Gert Schubring (2001), first presented in 1998, has shown that this is 
a misattribution and his first name is unknown. 


(Source: Schubring, G. (2001) ‘Argand and the early work on graphical 
representation: New sources and interpretations’, Proceedings of the Wessel 
Symposium at the Royal Danish Academy of Sciences and Letters. 
Copenhagen, August 11-15 1998, pp. 125-146.) 


Real numbers are represented in the complex plane by points on the 
horizontal axis; this axis is called the real axis. Similarly, numbers of the 
form iy are represented by points on the vertical axis; this axis is called 
the imaginary axis. 


Exercise A64 


Draw a diagram showing each of the following points in the complex plane: 


243i, —3+2i, —2—i, 307 


Figure 3 The complex plane




2.2 Arithmetic of complex numbers 


Arithmetic operations on complex numbers are carried out as for real 
numbers, except that we replace i? by —1 wherever it occurs. 


Worked Exercise A27 


Let $z_1 = 1 + 2i$ and $z_2 = 3 - 4i$. Determine the following complex numbers.


(a) $z_1 + z_2$ (b) $z_1 - z_2$ (c) $z_1 z_2$ (d) $z_1^2$


Worked Exercise A27 illustrates how we add, subtract and multiply two 
given complex numbers. We can apply the same methods to two general 
complex numbers $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$, and obtain the following
formal definitions of addition, subtraction and multiplication in C. 


Definitions 


Let $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$ be any complex numbers. Then
the following operations can be applied.


Addition $z_1 + z_2 = (x_1 + x_2) + i(y_1 + y_2)$
Subtraction $z_1 - z_2 = (x_1 - x_2) + i(y_1 - y_2)$


Multiplication $z_1 z_2 = (x_1 x_2 - y_1 y_2) + i(x_1 y_2 + y_1 x_2)$
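
As an illustration only, the three definitions can be written directly as Python functions acting on pairs (x, y) of real and imaginary parts. The function names and the sample numbers $2 + 3i$ and $1 - i$ are assumptions made for this example; the final lines compare the results with Python's built-in complex arithmetic.

```python
def add(z1, z2):
    (x1, y1), (x2, y2) = z1, z2
    return (x1 + x2, y1 + y2)


def subtract(z1, z2):
    (x1, y1), (x2, y2) = z1, z2
    return (x1 - x2, y1 - y2)


def multiply(z1, z2):
    (x1, y1), (x2, y2) = z1, z2
    return (x1 * x2 - y1 * y2, x1 * y2 + y1 * x2)


z1, z2 = (2, 3), (1, -1)  # 2 + 3i and 1 - i
print(add(z1, z2))        # (3, 2)
print(subtract(z1, z2))   # (1, 4)
print(multiply(z1, z2))   # (5, 1)

# Cross-check against the built-in complex type.
w = complex(*z1) * complex(*z2)
print((w.real, w.imag) == multiply(z1, z2))  # True
```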


There is no need to remember or look up these formulas. For calculations, 
you can use the methods of Worked Exercise A27. Note that, since the 
usual rules of algebra hold, so do familiar algebraic identities such as 


$$(z_1 + z_2)^2 = z_1^2 + 2z_1 z_2 + z_2^2$$


and


$$z_1^2 - z_2^2 = (z_1 - z_2)(z_1 + z_2).$$


An obvious omission from the list of definitions in the box above is 
division. We will return to division after looking at the complex conjugate 
and modulus of a complex number. 


Exercise A65


Determine the following complex numbers. 
(a) $(3 - 5i) + (2 + 4i)$ (b) $(2 - 3i)(-3 + 2i)$ (c) $(5 + 3i)^2$
(d) $(1 + i)(7 + 2i)(4 - i)$


Complex conjugate 


Many manipulations involving complex numbers, such as division, can be 
simplified by using the idea of a complex conjugate. 


Definition 


The complex conjugate $\overline{z}$ of the complex number $z = x + iy$ is the
complex number $x - iy$.


For example, if $z = 1 - 2i$, then $\overline{z} = 1 + 2i$. In geometric terms, $\overline{z}$ is the
image of $z$ under reflection in the real axis, as shown in Figure 4.

Figure 4 The complex conjugate


Exercise A66 


Let $z_1 = -2 + 3i$ and $z_2 = 3 - i$. Write down $\overline{z_1}$ and $\overline{z_2}$, and draw a
diagram showing $z_1$, $z_2$, $\overline{z_1}$ and $\overline{z_2}$ in the complex plane.




The following properties of complex conjugates are particularly useful. 


Properties of complex conjugates 
Let z1, z2 and z be any complex numbers. Then: 


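1. $\overline{z_1 + z_2} = \overline{z_1} + \overline{z_2}$

2. $\overline{z_1 z_2} = \overline{z_1}\,\overline{z_2}$

3. $z + \overline{z} = 2\operatorname{Re} z$

4. $z - \overline{z} = 2i\operatorname{Im} z$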


To prove that property 1 holds, we consider two arbitrary complex 
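numbers. Let $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$. Then


$$\begin{aligned}
\overline{z_1 + z_2} &= \overline{(x_1 + x_2) + i(y_1 + y_2)} \\
&= (x_1 + x_2) - i(y_1 + y_2) \\
&= (x_1 - iy_1) + (x_2 - iy_2) \\
&= \overline{z_1} + \overline{z_2}.
\end{aligned}$$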


Exercise A67 


Use a similar approach to prove that properties 2, 3 and 4 all hold. 


Modulus of a complex number 


We also need the idea of the modulus of a complex number. Recall that
the modulus of a real number $x$ is defined by


$$|x| = \begin{cases} x, & \text{if } x \geq 0, \\ -x, & \text{if } x < 0. \end{cases}$$


For example, $|7| = 7$ and $|-6| = 6$.


In other words, $|x|$ is the distance from the point $x$ on the real line to the
origin. We extend this definition to complex numbers, as illustrated in
Figure 5.

Figure 5 The modulus of a complex number

Definition


The modulus |z| of a complex number z is the distance from the 
point z in the complex plane to the origin. 


Thus the modulus of the complex number $z = x + iy$ is


$$|z| = \sqrt{x^2 + y^2}.$$
For example, if $z = 3 - 4i$, then $|z| = \sqrt{3^2 + (-4)^2} = \sqrt{25} = 5$.




Exercise A68 


Determine the modulus of each of the following complex numbers. 
(a) $5 + 12i$ (b) $1 + i$ (c) $-5$


The modulus of a complex number has many properties similar to those of 
the modulus of a real number. 


Properties of the modulus 
1. $|z| \geq 0$ for any $z \in \mathbb{C}$, with equality only when $z = 0$.


2. $|z_1 z_2| = |z_1||z_2|$ for any $z_1, z_2 \in \mathbb{C}$.


Property 1 is clear from the definition of |z|. Property 2 can be shown to 
hold in a similar way to property 2 of complex conjugates in the solution 
to Exercise A67. 


The following useful result shows the link between modulus and distance in 
the complex plane. 


Distance formula for C 


The distance between the points $z_1$ and $z_2$ in the complex plane is
$|z_1 - z_2|$.


This is obtained by applying Pythagoras’ Theorem to the triangle shown
in Figure 6. The formula holds wherever the points $z_1$ and $z_2$ are situated
in the complex plane.


Figure 6 The distance formula for C (a right-angled triangle with vertices at $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$, shorter sides $x_1 - x_2$ and $y_1 - y_2$, and hypotenuse $|z_1 - z_2| = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$)
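
The modulus and the distance formula are easy to experiment with in Python, where the built-in function `abs` applied to a complex number returns its modulus. The sketch below is an illustration only; the function name `distance` and the sample points are assumptions made for this example.

```python
import math


def distance(z1, z2):
    """Distance between the points z1 and z2 in the complex plane."""
    return abs(z1 - z2)


print(abs(3 - 4j))                 # 5.0, the modulus of 3 - 4i
print(distance(1 + 1j, 4 + 5j))    # 5.0

# The same distance from Pythagoras' Theorem applied to the real and
# imaginary parts.
print(math.hypot(1 - 4, 1 - 5))    # 5.0
```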


Figure 7 The complex conjugate

Exercise A69 


For each of the following pairs $z_1$, $z_2$ of complex numbers, draw a diagram
showing $z_1$ and $z_2$ in the complex plane, find $z_1 - z_2$ and evaluate $|z_1 - z_2|$.


(a) $z_1 = 3 + i$, $z_2 = 1 + 2i$.
(b) $z_1 = 1$, $z_2 = i$.
6-3) a52 


(c) z 


The following properties describe the relationship between the modulus 
and the complex conjugate of a complex number. 


Conjugate-modulus properties 
1. $|\overline{z}| = |z|$ for all $z \in \mathbb{C}$.
2. $z\overline{z} = |z|^2$ for all $z \in \mathbb{C}$.


To see why these properties hold, let $z = x + iy$. Then
$\overline{z} = x - iy = x + i(-y)$, so


$$|\overline{z}| = \sqrt{x^2 + (-y)^2} = \sqrt{x^2 + y^2} = |z|.$$
This can also be seen geometrically in Figure 7, where the distances from
the origin to both $z$ and its complex conjugate $\overline{z}$ are the same. We also
have


$$z\overline{z} = (x + iy)(x - iy) = x^2 + ixy - ixy - i^2y^2 = x^2 + y^2 = |z|^2.$$


Division of complex numbers 


The second of the conjugate-modulus properties in the above box enables
us to find reciprocals of complex numbers and to divide one complex 
number by another, as shown in the next worked exercise. In exactly the 
same way as for real numbers, we cannot find a reciprocal of zero, nor 
divide any complex number by zero. 


Worked Exercise A28 


(a) Find the reciprocal of 2 — 5i. 


(b) Find the quotient $\dfrac{3 - i}{1 + 2i}$.


Solution 
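(a) (To express the reciprocal $1/(2 - 5i)$ in the form $a + ib$, we
multiply the numerator and denominator by $2 + 5i$, the complex
conjugate of the denominator $2 - 5i$, and then use the second
conjugate-modulus property.)
The reciprocal is
$$\begin{aligned}
\frac{1}{2 - 5i} &= \frac{2 + 5i}{(2 - 5i)(2 + 5i)} \\
&= \frac{2 + 5i}{|2 - 5i|^2} \\
&= \frac{2 + 5i}{4 + 25} \\
&= \frac{2}{29} + \frac{5}{29}i = \frac{1}{29}(2 + 5i).
\end{aligned}$$


(b) (We multiply the numerator and denominator by $1 - 2i$, the
complex conjugate of the denominator $1 + 2i$, and then use the
second conjugate-modulus property.)

$$\begin{aligned}
\frac{3 - i}{1 + 2i} &= \frac{(3 - i)(1 - 2i)}{(1 + 2i)(1 - 2i)} \\
&= \frac{3 - i - 6i + 2i^2}{1^2 + 2^2} \\
&= \frac{1 - 7i}{1 + 4} \\
&= \frac{1}{5} - \frac{7}{5}i = \frac{1}{5}(1 - 7i).
\end{aligned}$$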


The method used in Worked Exercise A28, of multiplying the numerator
and denominator by the complex conjugate of the denominator, enables us
to find the reciprocal of any non-zero complex number $z$, and the quotient
$z_1/z_2$ of any two complex numbers $z_1$ and $z_2$, where $z_2 \neq 0$. We can obtain
general formulas as follows.
For the reciprocal, we have

$$\frac{1}{z} = \frac{1 \times \overline{z}}{z \times \overline{z}} = \frac{\overline{z}}{|z|^2}, \quad \text{for } z \neq 0.$$


If $z = x + iy$, then $\overline{z} = x - iy$ and $|z|^2 = x^2 + y^2$, so we obtain


$$\frac{1}{x + iy} = \frac{x - iy}{x^2 + y^2}.$$
For the quotient $z_1/z_2$, we have


$$\frac{z_1}{z_2} = \frac{z_1 \times \overline{z_2}}{z_2 \times \overline{z_2}} = \frac{z_1 \overline{z_2}}{|z_2|^2}, \quad \text{for } z_2 \neq 0.$$




If $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$, this can be rewritten as


$$\frac{x_1 + iy_1}{x_2 + iy_2} = \frac{(x_1 + iy_1)(x_2 - iy_2)}{x_2^2 + y_2^2} = \frac{(x_1 x_2 + y_1 y_2) + i(x_2 y_1 - x_1 y_2)}{x_2^2 + y_2^2}.$$


These formulas may be used in theoretical work, but for calculations of 
reciprocals and quotients it is simpler to use the method of Worked 
Exercise A28. 
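
For readers who would like to check such calculations, the general quotient formula can be coded directly. The Python sketch below is an illustration only: it represents a complex number as a pair (x, y) with integer or rational parts and uses `fractions.Fraction` so that the answers are exact. The function name `divide` is an assumption made for this example.

```python
from fractions import Fraction


def divide(z1, z2):
    """Quotient z1/z2 of complex numbers given as (real, imaginary) pairs."""
    (x1, y1), (x2, y2) = z1, z2
    denom = x2 * x2 + y2 * y2  # |z2|^2
    if denom == 0:
        raise ZeroDivisionError("cannot divide by the complex number 0")
    real = Fraction(x1 * x2 + y1 * y2, denom)
    imag = Fraction(x2 * y1 - x1 * y2, denom)
    return (real, imag)


print(divide((1, 0), (2, -5)))  # (Fraction(2, 29), Fraction(5, 29))
print(divide((3, -1), (1, 2)))  # (Fraction(1, 5), Fraction(-7, 5))
```

The two results agree with parts (a) and (b) of Worked Exercise A28.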


Exercise A70 


Find the reciprocal of each of the following complex numbers. 
(a) $3 - i$ (b) $-1 + 2i$


Exercise A71 


Evaluate each of the following quotients. 


5 243i 
a, b 
@) 5 © maya 


Arithmetic properties of complex numbers 


The set of complex numbers C satisfies the eleven properties previously 
given for arithmetic in R. These properties are stated in the box below 
(their proofs are not given here). Since C satisfies these eleven properties 
(and also satisfies the twelfth, trivial, property mentioned in
Subsection 1.2), it is a field, like R and Q.


Arithmetic in C 
Properties for addition 


A1 Closure For all $z_1, z_2 \in \mathbb{C}$,
$z_1 + z_2 \in \mathbb{C}$.

A2 Associativity For all $z_1, z_2, z_3 \in \mathbb{C}$,
$z_1 + (z_2 + z_3) = (z_1 + z_2) + z_3$.

A3 Additive identity For all $z \in \mathbb{C}$,
$z + 0 = z = 0 + z$.

A4 Additive inverses For each $z \in \mathbb{C}$, there is a
number $-z \in \mathbb{C}$ such that
$z + (-z) = 0 = (-z) + z$.

A5 Commutativity For all $z_1, z_2 \in \mathbb{C}$,
$z_1 + z_2 = z_2 + z_1$.




Properties for multiplication 


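M1 Closure For all $z_1, z_2 \in \mathbb{C}$,
$z_1 \times z_2 \in \mathbb{C}$.

M2 Associativity For all $z_1, z_2, z_3 \in \mathbb{C}$,
$z_1 \times (z_2 \times z_3) = (z_1 \times z_2) \times z_3$.

M3 Multiplicative identity For all $z \in \mathbb{C}$,
$z \times 1 = z = 1 \times z$.

M4 Multiplicative inverses For each $z \in \mathbb{C} - \{0\}$, there is a
number $z^{-1} \in \mathbb{C}$ such that
$z \times z^{-1} = 1 = z^{-1} \times z$.

M5 Commutativity For all $z_1, z_2 \in \mathbb{C}$,
$z_1 \times z_2 = z_2 \times z_1$.

Property combining addition and multiplication

D1 Distributivity For all $z_1, z_2, z_3 \in \mathbb{C}$,
$z_1 \times (z_2 + z_3) = (z_1 \times z_2) + (z_1 \times z_3)$.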


In particular, $0 = 0 + 0i$ plays the same role in C as the real number 0 does
in R: it is the additive identity. The number $1 = 1 + 0i$ plays the same
role as 1: it is the multiplicative identity. We also have that the
additive inverse (or negative) of $z = x + iy$ is $-z = -x - iy$, and the
multiplicative inverse (or reciprocal) of $z = x + iy$ is
$$z^{-1} = \frac{1}{x + iy} = \frac{x - iy}{x^2 + y^2}, \quad \text{for } z \neq 0.$$

However, one very important difference between the set of real numbers 
and the set of complex numbers is that, unlike the real numbers, the 
complex numbers are not ordered. 


Recall that, for any two real numbers a and b, exactly one of the three 
properties 


a<b, a=b, or a>b 


is true; this is what we mean by saying that the real numbers are ordered. 
But this is not the case for the complex numbers. For example, given the 
complex numbers $1 + 2i$ and $-1 + 3i$, we cannot say that one of the
following properties is true:


$$1 + 2i > -1 + 3i \quad\text{or}\quad 1 + 2i = -1 + 3i \quad\text{or}\quad 1 + 2i < -1 + 3i.$$




Indeed, inequalities involving complex numbers make sense only if they are 
inequalities between real quantities, such as the moduli of the complex 
numbers. (Note that ‘moduli’ is the plural of ‘modulus’.) For example, 
inequalities such as 


$$|z - 2i| < 3 \quad\text{or}\quad \operatorname{Re} z > 5$$


are valid. 


2.3 Polar form 


You have seen that the complex number x + iy corresponds to the point 
(x,y) in the complex plane. This correspondence enables us to give an 
alternative description of complex numbers, using so-called polar form. 
This form is particularly useful when we discuss properties related to 
multiplication and division of complex numbers. 


Polar form is obtained by noting that the point in the complex plane
associated with the non-zero complex number $z = x + iy$ is uniquely
determined by the modulus $r = |z| = \sqrt{x^2 + y^2}$, together with the angle $\theta$
(measured in an anticlockwise direction in radians) between the positive
direction of the real axis and the line from the origin to the point, as
shown in Figure 8. We have
$$x = r\cos\theta \quad\text{and}\quad y = r\sin\theta,$$
so the complex number $z$ can be expressed as
$$z = x + iy = r(\cos\theta + i\sin\theta).$$

Figure 8 A complex number determined by its modulus and angle


This description of $z$ in terms of $r$ and $\theta$ is not unique because the angles
$\theta + 2\pi, \theta + 4\pi, \theta + 6\pi, \ldots$ also determine the same complex number.
However, if we restrict the angle $\theta$ to lie in the interval $(-\pi, \pi]$, then the
description is unique. (Some texts restrict $\theta$ to lie in the interval $[0, 2\pi)$.)


Note that for the complex number 0, which is represented in the complex
plane by the origin, the value of $r$ is 0, and $\theta$ is not defined.


Definitions 


A non-zero complex number $z = x + iy$ is in polar form if it is
expressed as


$$z = r(\cos\theta + i\sin\theta),$$


where $r = |z|$ and $\theta$ is any angle (measured in radians anticlockwise)
between the positive direction of the x-axis and the line joining $z$ to
the origin.


Such an angle $\theta$ is called an argument of the complex number $z$, and
is denoted by $\arg z$. The principal argument of $z$ is the value of
$\arg z$ that lies in the interval $(-\pi, \pi]$, and is denoted by $\operatorname{Arg} z$.
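
As an illustration only, Python's standard `cmath` module can be used to compute the modulus and an argument of a complex number. The function `cmath.phase` returns an angle in radians between $-\pi$ and $\pi$, which for ordinary inputs is the principal argument; for an input whose imaginary part is the floating-point value negative zero it can return $-\pi$ instead of $\pi$. The function name `to_polar` is an assumption made for this example.

```python
import cmath
import math


def to_polar(z):
    """Return (r, theta) with r = |z| and theta an argument of z."""
    if z == 0:
        raise ValueError("the argument of 0 is not defined")
    return abs(z), cmath.phase(z)


r, theta = to_polar(1 + 1j)
print(math.isclose(r, math.sqrt(2)), math.isclose(theta, math.pi / 4))  # True True

# A negative real number has principal argument pi.
print(to_polar(complex(-2, 0)) == (2.0, math.pi))  # True
```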




The term principal argument is a shortened form of the more conventional
‘principal value of the argument’. Some texts use $r \operatorname{cis} \theta$, $r\angle\theta$ or $(r, \theta)$ as
shorthand for $r(\cos\theta + i\sin\theta)$.


Sometimes we refer to z = x + iy as the Cartesian form of z, to 
distinguish it from the polar form. 


We now look at how to convert a complex number from polar form to 
Cartesian form, and vice versa. 


When carrying out such conversions, it is useful to remember the values in 
the table below, as these will help you in some special cases. You may find 
it easier to remember the triangles in Figure 9, from which you can work 
out most of the values in the table. 


Sines and cosines of special angles 


Figure 9 Triangles for finding sines and cosines of special angles


The following trigonometric identities are also helpful; they are included in 
the module Handbook. 


Useful trigonometric identities 


For any $\theta \in \mathbb{R}$,
$$\sin(\pi - \theta) = \sin\theta, \qquad \sin(-\theta) = -\sin\theta,$$
$$\cos(\pi - \theta) = -\cos\theta, \qquad \cos(-\theta) = \cos\theta.$$




You may be able to remember these identities by roughly sketching graphs 
of the sine and cosine functions, and using their symmetry. For example, 
we can sketch the sine function as in Figure 10. 


Figure 10 A sketch of the sine function for working out symmetry identities
(The sketch marks $\sin(\pi - \theta) = \sin\theta$ and $\sin(\pi + \theta) = \sin(-\theta) = -\sin\theta$.)


Converting a complex number from polar form to Cartesian form is 
straightforward: we simply use the equations 


$$x = r\cos\theta, \quad y = r\sin\theta$$


as above to find $x$ and $y$ given $r$ and $\theta$. This is demonstrated in the
following worked exercise.


Worked Exercise A29 


Express each of the following complex numbers in Cartesian form. 


(a) 3 (cos £ + isin =) (b) cos (-3) ER (-2) 
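

The conversion from polar form to Cartesian form can also be checked numerically. The following is a minimal Python sketch, not part of the module text; the function name `polar_to_cartesian` and the sample value are illustrative assumptions and are not taken from the exercises.

```python
import math

def polar_to_cartesian(r, theta):
    """Return (x, y) for the complex number r(cos theta + i sin theta)."""
    # x = r cos(theta), y = r sin(theta), with theta in radians.
    return (r * math.cos(theta), r * math.sin(theta))

# Illustrative value: 2(cos(pi/6) + i sin(pi/6)).
x, y = polar_to_cartesian(2, math.pi / 6)
print(x, y)  # approximately sqrt(3) and 1
```

Floating-point arithmetic gives only approximations to exact values such as $\sqrt{3}$, so a check of this kind supports, but does not replace, the exact calculation using the special angles.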




Exercise A72 


Express each of the following complex numbers in Cartesian form. 


(a) 2 (cos = + isin =) (b) 4 (cos (-=) Pen (-=)) 


To convert a non-zero complex number $z$ from Cartesian form $x + iy$ to
polar form $r(\cos\theta + i\sin\theta)$, we first find the modulus $r$ using the formula
$$r = \sqrt{x^2 + y^2}.$$
Then we find the principal argument $\theta$; recall that this is the angle in the
interval $(-\pi, \pi]$ measured in an anticlockwise direction (in radians)
between the positive direction of the real axis and the line from the origin
to $z$.


If $z$ is either real or imaginary, then it lies on one of the axes and has
principal argument $0$, $\pi/2$, $\pi$ or $-\pi/2$, as shown in Figure 11.


Figure 11 The principal argument $\theta$ when $z$ is real or imaginary

Otherwise, to find the principal argument $\theta$ we need to solve the equations
$$\cos\theta = \frac{x}{r} \quad\text{and}\quad \sin\theta = \frac{y}{r}, \quad\text{where } \theta \in (-\pi, \pi].$$

We can do this by first finding the acute angle $\phi$ that satisfies the related
equation
$$\cos\phi = \frac{|x|}{r} \quad \left(\text{or equivalently, } \sin\phi = \frac{|y|}{r} \text{ or } \tan\phi = \frac{|y|}{|x|}\right).$$


This acute angle $\phi$ is the angle at the origin in the right-angled triangle
formed by drawing the perpendicular from $z$ to the real axis, as illustrated
in Figure 12 in the case where $z$ lies in the second quadrant. (Remember
that the quadrants are numbered as shown in Figure 13.)


Figure 13 The quadrants of the plane


Figure 14 A complex number $z$ in the third quadrant, with $\phi = \pi/3$


Once we have found this acute angle $\phi$, we can find the principal argument
$\theta$ by sketching $z$ in the complex plane (the important thing is to get it in
the correct quadrant), marking the acute angle $\phi$ on the sketch, and
deducing the principal argument $\theta$. For example, if $z$ is in the third
quadrant and $\phi = \pi/3$, then we can see from the sketch in Figure 14 that
$$\theta = -(\pi - \phi) = -\frac{2\pi}{3}.$$


In fact, the relationship between the principal argument $\theta$ and the acute
angle $\phi$, for each of the four quadrants in which $z$ can lie, is as shown in
Figure 15. So, if you prefer, you can use the appropriate formula from
Figure 15 to deduce $\theta$ from $\phi$. You can also find the quadrant in which $z$
lies by using the values of $x$ and $y$, without having to sketch $z$ in the
complex plane.


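Second quadrant: $\theta = \pi - \phi$        First quadrant: $\theta = \phi$
Third quadrant: $\theta = -(\pi - \phi)$      Fourth quadrant: $\theta = -\phi$


Figure 15 The relationship between $\theta$ and $\phi$ for each quadrant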


Both methods for finding $\theta$ are illustrated in the next worked exercise.


Worked Exercise A30 


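Express each of the following complex numbers in polar form, using the
principal argument.


(a) $2 + 2i$    (b) $-\frac{1}{2}(1 + i\sqrt{3})$


The polar form of $2 + 2i$ in terms of the principal argument is
therefore
$$2\sqrt{2}\left(\cos\frac{\pi}{4} + i\sin\frac{\pi}{4}\right).$$

(b) Let $z = x + iy = -\frac{1}{2}(1 + i\sqrt{3})$, so $x = -\frac{1}{2}$ and $y = -\sqrt{3}/2$.


Then $z = r(\cos\theta + i\sin\theta)$, where
$$r = \sqrt{\left(-\tfrac{1}{2}\right)^2 + \left(-\tfrac{\sqrt{3}}{2}\right)^2} = 1.$$


A sketch helps here. We have added on the values for $\phi$ and $\theta$,
although they are not known when this is first sketched. (The sketch shows
$z = -\frac{1}{2}(1 + i\sqrt{3})$ in the third quadrant.)


To find $\theta$, we calculate
$$\cos\phi = \frac{|x|}{r} = \frac{1}{2}.$$
So $\phi = \pi/3$, and from the drawing we see that
$$\theta = -(\pi - \phi) = -2\pi/3.$$


The polar form of $-\frac{1}{2}(1 + i\sqrt{3})$ in terms of the principal
argument is therefore
$$\cos\left(-\frac{2\pi}{3}\right) + i\sin\left(-\frac{2\pi}{3}\right).$$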


Exercise A73 


For each of the following complex numbers, draw a diagram showing its 
location in the complex plane. Express the complex number in polar form 
using the principal argument, and mark this argument and the modulus on 
your diagram. 


(a) $-1 + i$    (b) $1 - i\sqrt{3}$    (c) $-5$




The following pair of trigonometric identities simplify multiplication of 
complex numbers in polar form; they are included in the module 
Handbook. 


More useful trigonometric identities 
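For any $\theta_1, \theta_2 \in \mathbb{R}$,
$$\sin(\theta_1 + \theta_2) = \sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2,$$
$$\cos(\theta_1 + \theta_2) = \cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2.$$


Let
$$z_1 = r_1(\cos\theta_1 + i\sin\theta_1) \quad\text{and}\quad z_2 = r_2(\cos\theta_2 + i\sin\theta_2).$$
Then, by the trigonometric identities above,
$$\begin{aligned}
z_1 z_2 &= r_1(\cos\theta_1 + i\sin\theta_1) \times r_2(\cos\theta_2 + i\sin\theta_2) \\
&= r_1 r_2(\cos\theta_1 + i\sin\theta_1)(\cos\theta_2 + i\sin\theta_2) \\
&= r_1 r_2(\cos\theta_1\cos\theta_2 + i\sin\theta_1\cos\theta_2 + i\cos\theta_1\sin\theta_2 + i^2\sin\theta_1\sin\theta_2) \\
&= r_1 r_2\big((\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2) + i(\sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2)\big) \\
&= r_1 r_2(\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)).
\end{aligned}$$
That is, to multiply two complex numbers in polar form, we multiply their
moduli and add their arguments:
$$z_1 z_2 = r_1 r_2(\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)). \tag{1}$$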


Worked Exercise A31 


Find the product $z_1 z_2$ in polar form using the principal argument for the
following complex numbers $z_1$ and $z_2$:


z = 2 (cos Z + isin Z) and z =3 (cos Z + isin Z) 
i 4 4 sE 3 3/° 




We can also use formula (1) for the product of two complex numbers in
polar form to establish a similar formula for the quotient of two complex
numbers. Specifically, we show that if
$$z_1 = r_1(\cos\theta_1 + i\sin\theta_1) \quad\text{and}\quad z_2 = r_2(\cos\theta_2 + i\sin\theta_2),$$
with $z_2 \neq 0$, which implies that $r_2 \neq 0$, then $z_1/z_2$ is the complex number
$z = r(\cos\theta + i\sin\theta)$, where $r = r_1/r_2$ and $\theta = \theta_1 - \theta_2$.


To see this, notice that since $r_1 = r r_2$ and $\theta_1 = \theta + \theta_2$ it follows from the
discussion above that $z_1 = z z_2$. Hence $z_1/z_2 = z$, as required.


That is, to divide a complex number $z_1$ by another complex number $z_2$, we
divide the modulus of $z_1$ by the modulus of $z_2$, and subtract the argument
of $z_2$ from the argument of $z_1$:
$$\frac{z_1}{z_2} = \frac{r_1}{r_2}(\cos(\theta_1 - \theta_2) + i\sin(\theta_1 - \theta_2)), \quad\text{where } z_2 \neq 0. \tag{2}$$


Worked Exercise A32 


Find the quotient $z_1/z_2$ in polar form using the principal argument for the
following complex numbers $z_1$ and $z_2$:


zZ = 2 (cos + isin Z) and z =3 (cos = + isin 2) 
a 4 4 27 3 3/° 


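In particular, if $z = r(\cos\theta + i\sin\theta)$ with $r \neq 0$, then the reciprocal of $z$ is
$$\begin{aligned}
\frac{1}{z} &= \frac{1}{r}(\cos(0 - \theta) + i\sin(0 - \theta)) \\
&= \frac{1}{r}(\cos(-\theta) + i\sin(-\theta)) \\
&= \frac{1}{r}(\cos\theta - i\sin\theta),
\end{aligned}$$
so we have the identity
$$\frac{1}{z} = \frac{1}{r}(\cos(-\theta) + i\sin(-\theta)). \tag{3}$$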




The methods that you have seen for multiplying complex numbers in polar
form can be generalised to apply to a product of several complex numbers.
These methods are summarised in the box below.


Product and quotient in polar form


• To multiply two (or more) complex numbers given in polar form,
multiply their moduli and add their arguments.


• To divide a complex number $z_1$ by a non-zero complex number $z_2$
when both are given in polar form, divide the modulus of $z_1$ by the
modulus of $z_2$, and subtract the argument of $z_2$ from the argument
of $z_1$.


If you want the principal argument of a product or quotient, then you may
need to add or subtract integer multiples of $2\pi$ from the argument
calculated, to obtain an angle in the interval $(-\pi, \pi]$.
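

The rules in the box, together with the adjustment to the principal argument, can be sketched in Python as follows. This is an illustrative sketch only: the function names `principal`, `polar_multiply` and `polar_divide` are assumptions introduced here, and the sample moduli and arguments are not taken from the exercises.

```python
import math

def principal(theta):
    """Shift theta by a multiple of 2*pi so that it lies in (-pi, pi]."""
    theta = math.fmod(theta, 2 * math.pi)  # now strictly between -2*pi and 2*pi
    if theta > math.pi:
        theta -= 2 * math.pi
    elif theta <= -math.pi:
        theta += 2 * math.pi
    return theta

def polar_multiply(r1, theta1, r2, theta2):
    """Multiply the moduli and add the arguments."""
    return (r1 * r2, principal(theta1 + theta2))

def polar_divide(r1, theta1, r2, theta2):
    """Divide the moduli and subtract the arguments (second number non-zero)."""
    if r2 == 0:
        raise ZeroDivisionError("cannot divide by the complex number 0")
    return (r1 / r2, principal(theta1 - theta2))

# Illustrative values: modulus 2, argument 3*pi/4 and modulus 3, argument pi/2.
product = polar_multiply(2, 3 * math.pi / 4, 3, math.pi / 2)
quotient = polar_divide(2, 3 * math.pi / 4, 3, math.pi / 2)
print(product)   # modulus 6, argument approximately -3*pi/4 (5*pi/4 reduced by 2*pi)
print(quotient)  # modulus 2/3, argument approximately pi/4
```

Here the sum of the arguments, $5\pi/4$, lies outside $(-\pi, \pi]$, so $2\pi$ is subtracted to give the principal argument $-3\pi/4$.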


Exercise A74 


Determine the product $z_1 z_2$ and the quotient $z_1/z_2$ in polar form using the
principal argument for the following complex numbers.


(a) 2 =4 (cos (-5) +isin (-7)) and z2 = ; (cos + isin =). 
1 


(b) -3 A r d = ( = +isin >) 
A cos 3 isin 3 an m= 5 cos 5 isin 5 : 


Exercise A75 


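Let $z_1 = -1 + i$, $z_2 = 1 - i\sqrt{3}$ and $z_3 = -5$.


Express $z_1 z_2 z_3$ and $\dfrac{z_2 z_3}{z_1}$ in polar form using the principal argument.


(You found in Exercise A73 that $-1 + i = \sqrt{2}\left(\cos\frac{3\pi}{4} + i\sin\frac{3\pi}{4}\right)$,
$1 - i\sqrt{3} = 2\left(\cos\left(-\frac{\pi}{3}\right) + i\sin\left(-\frac{\pi}{3}\right)\right)$ and $-5 = 5(\cos\pi + i\sin\pi)$.)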


2.4 Complex roots of polynomials 


We begin this section with a reminder of what we mean by the word ‘root’. 
In this unit, we use this term in two different, but related, senses, as given 
below. You met the first of these in Subsection 1.4. 




Definitions 


If p(z) is a polynomial, then the solutions of the polynomial equation 
p(z) = 0 are called the roots of p(z). 


If $a$ is a complex number, then the solutions of the equation $z^n = a$
are called the $n$th roots of $a$.


The two uses of the word ‘root’ are related as follows: the $n$th roots of $a$
are the roots of the polynomial $z^n - a$.


Recall that the roots of a polynomial are also called its zeros. 


In this subsection we look at how to find the nth roots of any complex 
number, and we consider the roots of polynomial equations more generally. 


We can obtain a useful result by considering what happens when we
multiply together $n$ copies of the same complex number in polar form. If
$z = r(\cos\theta + i\sin\theta)$, then by the method that you saw above for
multiplying complex numbers in polar form, we obtain
$$(r(\cos\theta + i\sin\theta))^n = r^n(\cos n\theta + i\sin n\theta), \quad\text{for } n \geq 1. \tag{4}$$
As before, the argument $n\theta$ may not be the principal argument of
$(\cos\theta + i\sin\theta)^n$, so we may need to add or subtract integer multiples of $2\pi$
to obtain an angle in the interval $(-\pi, \pi]$. This is illustrated in the next
worked exercise.


Worked Exercise A33 


Find $z^4$, where $z = -1 + i$.
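

Equation (4), with the reduction to the principal argument, can be sketched in Python as below. This is a hedged illustration rather than the module's solution: the function name `polar_power` is an assumption, and the example uses a different number, $1 + i = \sqrt{2}(\cos(\pi/4) + i\sin(\pi/4))$, raised to the power 6.

```python
import math

def polar_power(r, theta, n):
    """Apply equation (4) for an integer n >= 1.

    Returns the modulus r**n and the argument n*theta reduced to (-pi, pi].
    """
    angle = math.fmod(n * theta, 2 * math.pi)
    if angle > math.pi:
        angle -= 2 * math.pi
    elif angle <= -math.pi:
        angle += 2 * math.pi
    return (r ** n, angle)

# (1 + i)^6: modulus (sqrt 2)^6 = 8, argument 6*pi/4 = 3*pi/2, which reduces to -pi/2.
modulus, angle = polar_power(math.sqrt(2), math.pi / 4, 6)
print(modulus, angle)  # approximately 8 and -pi/2, so (1 + i)^6 = -8i
```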




You have seen that equation (4) holds for all $n \geq 1$; in fact, it is true for all
integers. This follows from a result known as de Moivre’s Theorem.


Theorem A4 de Moivre’s Theorem


If $z = \cos\theta + i\sin\theta$, then for any $n \in \mathbb{Z}$,
$$z^n = (\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta.$$


Abraham de Moivre (1667-1754) was a French mathematician who 
worked in England. He was part of the Huguenot flight from France 
after the revocation of the Edict of Nantes in 1685 and is first 
recorded as being in England in late 1686. De Moivre’s most 
important work is The Doctrine of Chances (1718), the first textbook 
for the calculus of probabilities. 


Abraham de Moivre 


To see that de Moivre’s Theorem is true for all integers, we need to also
consider the cases where $n = 0$ and $n$ is negative. We look at these cases
separately, as follows.


For $n = 0$, we have
$$(\cos\theta + i\sin\theta)^0 = 1,$$
and
$$\cos(0 \times \theta) + i\sin(0 \times \theta) = \cos 0 + i\sin 0 = 1.$$
Thus the result holds for $n = 0$.

For $n = -m$, where $m$ is a positive integer, we have
$$(\cos\theta + i\sin\theta)^n = (\cos\theta + i\sin\theta)^{-m} = \frac{1}{(\cos\theta + i\sin\theta)^m},$$
and we know that $(\cos\theta + i\sin\theta)^m = \cos(m\theta) + i\sin(m\theta)$, since $m$ is a
positive integer. Therefore
$$\begin{aligned}
(\cos\theta + i\sin\theta)^n &= \frac{1}{\cos(m\theta) + i\sin(m\theta)} \\
&= \cos(-m\theta) + i\sin(-m\theta) \quad\text{(by formula (3))} \\
&= \cos n\theta + i\sin n\theta,
\end{aligned}$$
as required.


One application of de Moivre’s Theorem is in finding the $n$th roots of
complex numbers; that is, in solving equations of the form $z^n = a$, where
$a \in \mathbb{C}$. Before you see how to do this, you are asked in the next exercise to
use the theorem to verify some solutions of such an equation.




Exercise A76 


(a) Write down the complex number 1 in polar form.


(b) Use de Moivre’s Theorem to show that each of the three complex
numbers with polar forms
$$\cos 0 + i\sin 0, \quad \cos\frac{2\pi}{3} + i\sin\frac{2\pi}{3} \quad\text{and}\quad \cos\frac{4\pi}{3} + i\sin\frac{4\pi}{3}$$
satisfies the equation $z^3 = 1$.


(c) Write down the three solutions to the equation $z^3 = 1$ given in
part (b) in Cartesian form.


The solution to Exercise A76 verifies that the three given complex
numbers are solutions of the equation $z^3 = 1$. However, what we really
want is a method that will enable us to find solutions of such an equation.
Fortunately, de Moivre’s Theorem enables us to do this. The method is
demonstrated in the next worked exercise.


Worked Exercise A34 


Solve the equation $z^3 = -27$. Find the Cartesian form of each solution,
and sketch the solutions in the complex plane.


Solution


Write the variable $z$ in polar form, in terms of a variable
modulus $r$ and a variable argument $\theta$. Also write the number on the
right-hand side of the equation in polar form.

Let $z = r(\cos\theta + i\sin\theta)$. Also, $-27 = 27(\cos\pi + i\sin\pi)$. So the
equation $z^3 = -27$ is
$$r^3(\cos\theta + i\sin\theta)^3 = 27(\cos\pi + i\sin\pi).$$

Use de Moivre’s Theorem to find the polar form of the left-hand
side.

By de Moivre’s Theorem, the equation can be written as
$$r^3(\cos 3\theta + i\sin 3\theta) = 27(\cos\pi + i\sin\pi).$$

Find $r$ by comparing moduli on each side.

Comparing moduli gives $r^3 = 27$, so $r = 3$.


Now find $\theta$ by comparing arguments on each side. One solution for
$\theta$ is obtained by taking $3\theta = \pi$. However, we could also take
$3\pi, 5\pi, 7\pi, \ldots$ as arguments of $-27$, so $3\theta = 3\pi$, $3\theta = 5\pi$, $3\theta = 7\pi$,
$\ldots$ also give solutions. In general, for any $k \in \mathbb{Z}$, the equation
$3\theta = \pi + 2k\pi$, that is, $\theta = \dfrac{\pi}{3} + \dfrac{2k\pi}{3}$, gives a solution. However, as
discussed after this worked exercise, we need consider only $k = 0, 1, 2$,
as other values of $k$ just repeat the same three solutions.




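The possible values of $\theta$ are given by
$$\theta = \frac{\pi}{3} + \frac{2k\pi}{3}$$
for $k = 0, 1, 2$. So they are
$$\theta = \frac{\pi}{3}, \ \pi, \ \frac{5\pi}{3}.$$


Write out the solutions. It is convenient to label them as $z_k$; that
is, $z_0, z_1, z_2$.

Thus the solutions of the equation are
$$z_0 = 3\left(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}\right),$$
$$z_1 = 3(\cos\pi + i\sin\pi),$$
$$z_2 = 3\left(\cos\frac{5\pi}{3} + i\sin\frac{5\pi}{3}\right).$$


We can write $z_2$ using its principal argument as follows:
$$z_2 = 3\left(\cos\left(-\frac{\pi}{3}\right) + i\sin\left(-\frac{\pi}{3}\right)\right).$$


In Cartesian form, we have
$$z_0 = \tfrac{3}{2}(1 + i\sqrt{3}), \quad z_1 = -3, \quad z_2 = \tfrac{3}{2}(1 - i\sqrt{3}).$$


A sketch of the solutions on the complex plane is given below.


(Sketch showing the points $z_0 = \frac{3}{2}(1 + i\sqrt{3})$, $z_1 = -3$ and $z_2 = \frac{3}{2}(1 - i\sqrt{3})$.)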
In Worked Exercise A34 we took $k = 0, 1, 2$ in the formula
$$\theta = \frac{\pi}{3} + \frac{2k\pi}{3}$$
and obtained three corresponding solutions $z_0, z_1, z_2$. Notice that if we
take $k = 3$ in the formula, then we obtain
$$\theta = \frac{\pi}{3} + \frac{6\pi}{3} = \frac{\pi}{3} + 2\pi,$$
which gives solution $z_0$ again, since this value of $\theta$ differs from the
argument of $z_0$ by an integer multiple of $2\pi$. You can check in the same
way that if we take $k = 4$ then we obtain solution $z_1$ again, and if we take
$k = 5$ then we obtain solution $z_2$ again, and so on. That is, if we take
$k = 0, 1, 2, 3, 4, \ldots$ in Worked Exercise A34, then after the third different
solution the solutions repeat in an indefinite cycle. The same solutions are
repeated if we take $k$ to be a negative integer.


We can use the method of Worked Exercise A34 to find the solutions of
any complex equation of the form
$$z^n = a,$$
where $a$ is a known complex number. To do this, we start by writing both
$z$ and $a$ in polar form so that, say,
$$z = r(\cos\theta + i\sin\theta) \quad\text{and}\quad a = \rho(\cos\phi + i\sin\phi),$$
where $r$ and $\theta$ are variables whose values we must find, and $\rho$ and $\phi$ are
known real numbers.

Then, by de Moivre’s Theorem, the equation $z^n = a$ can be written as
$$r^n(\cos n\theta + i\sin n\theta) = \rho(\cos\phi + i\sin\phi).$$
Hence we must have $r^n = \rho$, so $r = \rho^{1/n}$. Also $n\theta$ must represent the same
angle as $\phi$. We again use the fact that a complex number has many
arguments, so adding any integer multiple of $2\pi$ to the argument $\phi$ of $a$
gives the same complex number $a$. So we have
$$n\theta = \phi + 2k\pi, \quad\text{for any integer } k,$$
that is,
$$\theta = \frac{\phi}{n} + \frac{2k\pi}{n}, \quad\text{for any integer } k.$$


If $k = n$ we have $\theta = \phi/n + 2\pi$, which represents the same angle as $\phi/n$.
So taking $k = 0, 1, 2, \ldots, n - 1$ will give the $n$ different solutions of the
equation $z^n = a$.


Exercise A77 


(a) Use the method described above to find the six solutions of the
equation $z^6 = 1$ in polar form using the principal argument.


(b) Sketch the position of each solution in the complex plane. 


(c) Write down the Cartesian form of each solution. 




In Exercise A77 you found the solutions of the equation $z^6 = 1$. These are
known as the sixth roots of unity, and in the complex plane they are
equally spaced around the circle of radius 1, centre the origin. More
generally, the solutions of the equation $z^n = 1$ are known as the
$n$th roots of unity, and in the complex plane they are equally spaced
around the circle of radius 1, centre the origin. For any $n \in \mathbb{N}$, the real
number 1 is always one of the $n$th roots of unity.


The nth roots of any complex number are also equally spaced around a 
circle with centre the origin, but the circle may not have radius 1 and there 
may not be a root on the real axis, as the following exercises illustrate. 


Exercise A78 


Solve the equation $z^4 = -4$, expressing your answers in Cartesian form.


Mark your solutions on a diagram of the complex plane. 


Exercise A79 


Solve the equation $z^3 = 8i$, expressing your answers in Cartesian form.
Mark your solutions on a diagram of the complex plane. 


The next box summarises the method we have been using by giving a 
formula for the roots of a complex number. 


Roots of a complex number 


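Let $a = \rho(\cos\phi + i\sin\phi)$ be a complex number in polar form. Then,
for any $n \in \mathbb{N}$, the equation $z^n = a$ has $n$ solutions, given by
$$z = \rho^{1/n}\left(\cos\left(\frac{\phi}{n} + \frac{2k\pi}{n}\right) + i\sin\left(\frac{\phi}{n} + \frac{2k\pi}{n}\right)\right),$$
for $k = 0, 1, \ldots, n - 1$.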
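

The formula in the box translates directly into a short computation. The sketch below is illustrative only: the function name `nth_roots` is an assumption, it relies on Python's standard `cmath.polar` and `cmath.rect` to convert between forms, and the results are floating-point approximations rather than exact values.

```python
import cmath
import math

def nth_roots(a, n):
    """Return the n solutions of z**n = a for a non-zero complex number a."""
    if a == 0 or n < 1:
        raise ValueError("expected non-zero a and a positive integer n")
    rho, phi = cmath.polar(a)        # modulus and principal argument of a
    root_modulus = rho ** (1.0 / n)  # the positive real nth root of rho
    return [
        cmath.rect(root_modulus, phi / n + 2 * k * math.pi / n)
        for k in range(n)            # k = 0, 1, ..., n - 1
    ]

# The equation z^3 = -27 from Worked Exercise A34.
roots = nth_roots(-27, 3)
for z in roots:
    print(z)  # approximately 1.5 + 2.598i, -3, and 1.5 - 2.598i
```

Taking `k` beyond `n - 1` would only repeat these values, in line with the discussion following Worked Exercise A34.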


This result gives the $n$ solutions of any equation of the form $z^n = a$, where
$a$ is a non-zero complex number. Now the equation $z^n = a$, which can be
written as $z^n - a = 0$, is an example of a polynomial equation whose
coefficients, $1$ and $-a$, are complex numbers. Other examples of
polynomial equations with complex coefficients are


24+(1+iz+i=0 


and 


Caer +2iz? — 32 + (1 — 2i)z — 1 = 0. 




It can be shown that the following result holds; the proof is not included in 
this module. 


Theorem A5 The Fundamental Theorem of Algebra 


Every polynomial equation
$$a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 = 0,$$
where $a_n, a_{n-1}, \ldots, a_0 \in \mathbb{C}$ and $a_n \neq 0$, has at least one solution in $\mathbb{C}$.


We say that a number system is algebraically closed if every polynomial 
equation with coefficients in this system has a solution in this system. 
Therefore, unlike the reals and the rationals, the complex numbers are an 
algebraically closed system of numbers. 


In 1799 Carl Friedrich Gauss (1777-1855), one of the greatest 
mathematicians of all time, published what is often considered to be 
the first satisfactory proof of the Fundamental Theorem of Algebra. 
However, Gauss himself was not satisfied with the proof and over the 
course of the next fifty years published three further proofs. Later 
Gauss’s original proof, which was mainly geometrical, was shown to 
be incomplete. In 1920 the gap in Gauss’s proof was filled by the 
Russian mathematician Alexander Ostrowski (1893-1986). 


Although we know from the Fundamental Theorem of Algebra that every 
polynomial equation with coefficients in C has at least one solution in C, 
finding solutions of such polynomial equations is not easy. However, there 
are a few theorems that can help us do this in some special cases. 


One of these theorems is the Factor Theorem (Theorem A2), which you 
met for polynomials where the number system is R in Subsection 1.4, but 
is also true if the number system is C, as stated below. 


Alexander Ostrowski


Theorem A6 Factor Theorem (in C)


Let $p(z)$ be a polynomial with coefficients in $\mathbb{C}$, and let $\alpha \in \mathbb{C}$. Then
$p(\alpha) = 0$ if and only if $z - \alpha$ is a factor of $p(z)$.


In this statement of the theorem, the letter z has been used in place of x, 
as this is the label usually used for a complex variable. The proof is 
otherwise exactly the same as the proof of the theorem in R, which you 
will see in Unit A3. 


The next theorem is also useful. It can be deduced from the Fundamental 
Theorem of Algebra and the Factor Theorem. The proof of this follows in 
a similar way to the proof of Theorem A3 that you will see in Unit A3. 




Theorem A7 


Every polynomial
$$p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0,$$
where $n \geq 1$ and the coefficients are in $\mathbb{C}$, with $a_n \neq 0$, has a
factorisation
$$p(z) = a_n(z - \alpha_1)(z - \alpha_2) \cdots (z - \alpha_n),$$
where the complex numbers $\alpha_1, \alpha_2, \ldots, \alpha_n$ are the roots (not
necessarily distinct) of $p(z)$.


Together Theorems A5 and A7 tell us that a polynomial equation of 
degree n with coefficients in C has at least one solution in C, but can have 
no more than n solutions (all in C). Moreover, if ‘repeated’ solutions are 
counted separately, then a polynomial equation of degree n with 
coefficients in C has exactly n solutions (all in C). For example, the 
polynomial equation 


$$(z - 1)^3 (z + 4)^2 (z - i) = 0$$


has degree six and has exactly six solutions: the solution $1$ is counted
three times, the solution $-4$ is counted twice and the solution $i$ is counted
once.


A third result that can help us find solutions of polynomial equations in 
some special cases is Theorem A8 below. You may have noticed that it 
follows from the quadratic formula that, for a quadratic polynomial with 
real coefficients, the roots are either both real or they occur as a complex 
conjugate pair. 


More generally, we have the following result. 


Theorem A8 


If $p(z)$ is a polynomial with real coefficients, then whenever $\alpha$ is a
complex root of $p$, so is $\bar{\alpha}$.


This result is not proved here, but you might like to try to prove it
yourself; it is included as a ‘challenging’ exercise in the additional exercises
booklet for this unit. In addition, the factors $z - \alpha$ and $z - \bar{\alpha}$ of $p(z)$ can
be combined to give a real quadratic factor of $p(z)$, namely
$$(z - \alpha)(z - \bar{\alpha}) = z^2 - (\alpha + \bar{\alpha})z + \alpha\bar{\alpha},$$
which has real coefficients, since $\alpha + \bar{\alpha} = 2\operatorname{Re}\alpha$ and $\alpha\bar{\alpha} = |\alpha|^2$.


Worked Exercise A35 


(a) Show that $z = i$ is a root of the polynomial
$$p(z) = z^4 - 3z^3 + 2z^2 - 3z + 1.$$
(b) Hence find all the roots of $p(z)$.


Solution

(a) Check that $p(i) = 0$.

We have
$$\begin{aligned}
p(i) &= i^4 - 3i^3 + 2i^2 - 3i + 1 \\
&= 1 + 3i - 2 - 3i + 1 \\
&= 0,
\end{aligned}$$
so $i$ is a root of $p(z)$.


(b) The polynomial $p(z)$ has real coefficients, so for each complex
root $\alpha$, the complex conjugate $\bar{\alpha}$ is also a root.


Since $p$ has real coefficients, $z = -i$ is also a root of $p(z)$, so
$(z - i)(z + i) = z^2 + 1$ is a factor of $p(z)$.


We have $z^4 - 3z^3 + 2z^2 - 3z + 1 = (z^2 + 1)(az^2 + bz + c)$.
Equating the coefficients of $z^4$, $z^3$ and the constant term in this
equation gives $a = 1$, $b = -3$ and $c = 1$.


By equating coefficients, we obtain
$$z^4 - 3z^3 + 2z^2 - 3z + 1 = (z^2 + 1)(z^2 - 3z + 1).$$
So the remaining two roots of $p(z)$ are the solutions of the
equation $z^2 - 3z + 1 = 0$.
Using the quadratic formula, we have
$$z = \frac{3 \pm \sqrt{9 - 4}}{2} = \frac{3 \pm \sqrt{5}}{2}.$$
Hence the four roots of $p(z)$ are $i$, $-i$, $\frac{1}{2}(3 + \sqrt{5})$ and $\frac{1}{2}(3 - \sqrt{5})$.
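

The four roots found in Worked Exercise A35 can be checked numerically by substituting them back into $p(z)$. The following Python sketch is an illustration under stated assumptions: the helper name `evaluate` is introduced here, and it evaluates the polynomial by Horner's rule, which is a standard technique but not one used in the module text.

```python
import math

def evaluate(coefficients, z):
    """Evaluate a polynomial at z by Horner's rule.

    The coefficients are listed from the highest power down to the constant term.
    """
    result = 0
    for c in coefficients:
        result = result * z + c
    return result

# p(z) = z^4 - 3z^3 + 2z^2 - 3z + 1
p = [1, -3, 2, -3, 1]
candidate_roots = [1j, -1j, (3 + math.sqrt(5)) / 2, (3 - math.sqrt(5)) / 2]

for root in candidate_roots:
    # Each value should be zero, up to floating-point rounding.
    print(root, abs(evaluate(p, root)))
```

The conjugate pair $i$, $-i$ illustrates Theorem A8: because the coefficients are real, the two non-real roots occur together.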


Exercise A80 


(a) Show that z = 2i is a root of the polynomial 
$$p(z) = z^4 - 2z^3 + 7z^2 - 8z + 12.$$
(b) Hence find all the roots of p(z). 


Exercise A81 


Find, in the form $a_n z^n + \cdots + a_1 z + a_0$, a polynomial whose roots are $1$,
$-2$, $3i$ and $-3i$.




2.5 The complex exponential function 


The real exponential function $f(x) = e^x$, also written as $f(x) = \exp x$, has
the following properties:
$$e^0 = 1, \quad e^x e^y = e^{x+y}, \quad \frac{1}{e^x} = e^{-x}, \quad\text{for all } x, y \in \mathbb{R}.$$


We will consider the real function $f(x) = e^x$ in more detail in the analysis
units (Books D and F), but here we extend the definition of this function
to define a function $f(z) = e^z$ whose domain and codomain are $\mathbb{C}$.


We expect complex powers of $e$ to satisfy the same basic properties as real
powers of $e$. So, for example, we expect that
$$e^{z_1} e^{z_2} = e^{z_1 + z_2} \quad\text{and}\quad \frac{1}{e^z} = e^{-z}, \quad\text{for all } z, z_1, z_2 \in \mathbb{C}.$$


It turns out that, if this is to be achieved, then the definition of $e^z$ has to
be as follows.


Definition
If $z = x + iy$, then $e^z = e^x(\cos y + i\sin y)$.
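

The definition can be tried out numerically. The sketch below is a hedged illustration: the function name `complex_exp` is an assumption, and it is compared with Python's built-in `cmath.exp` only as a sanity check. The last lines test the expected product property on two arbitrary sample values; agreement on sample values is not a proof.

```python
import cmath
import math

def complex_exp(z):
    """Compute e^z from the definition e^z = e^x (cos y + i sin y), where z = x + iy."""
    z = complex(z)
    return math.exp(z.real) * complex(math.cos(z.imag), math.sin(z.imag))

z1 = complex(1, 2)      # arbitrary sample values
z2 = complex(-0.5, 3)

# Agreement with the library exponential, up to rounding.
print(abs(complex_exp(z1) - cmath.exp(z1)))

# The expected property e^{z1} e^{z2} = e^{z1 + z2}, up to rounding.
print(abs(complex_exp(z1) * complex_exp(z2) - complex_exp(z1 + z2)))
```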


Worked Exercise A36 


Use the definition above to show that
$$e^{z_1} e^{z_2} = e^{z_1 + z_2}$$
for all complex numbers $z_1$ and $z_2$.


Exercise A82 


(a) Using the definition for $e^z$ above and de Moivre’s Theorem
(Theorem A4), show that
$$\frac{1}{e^z} = e^{-z}, \quad\text{for all } z \in \mathbb{C}.$$
(b) Use the results from part (a) and Worked Exercise A36 to show that
$e^{z_1}/e^{z_2} = e^{z_1 - z_2}$ for all $z_1, z_2 \in \mathbb{C}$.


So the rules for multiplication and division of complex powers of e are 
exactly the same as those for real powers. Furthermore, when the 
exponent z is real, that is when z = x + 0i, where x € R, the definitions of 
a real and a complex power of e coincide, since 


$$e^z = e^{x + 0i} = e^x(\cos 0 + i\sin 0) = e^x.$$


On the other hand, if z = 0 + iy, where y € R, then the definition gives the 
following formula. 


Euler’s Formula 


$$e^{iy} = \cos y + i\sin y.$$


Putting $y = \pi$ in Euler’s Formula, we obtain
$$e^{i\pi} = \cos\pi + i\sin\pi = -1 + i0 = -1.$$


This equation is usually written as follows.


Euler’s Identity
$$e^{i\pi} + 1 = 0.$$


This is a remarkable relationship between five important numbers: $0$, $1$, $i$,
$\pi$ and $e$.




In 1748, Leonhard Euler, in his famous Introductio in analysin
infinitorum (Introduction to the Analysis of the Infinite), published
the equations
$$e^{+v\sqrt{-1}} = \cos v + \sqrt{-1}\,\sin v$$
and
$$e^{-v\sqrt{-1}} = \cos v - \sqrt{-1}\,\sin v.$$


However, Euler himself never published what we now know as Euler’s
Identity.


Euler was also responsible for introducing the symbol $i$ for the
imaginary number with the property that $i^2 = -1$, and the symbol $e$
to represent the base of natural logarithms, although he did not use
the symbol $i$ until 1777 and it was not published until 1794.


The formula $e^{iy} = \cos y + i\sin y$ gives us an alternative form for the
expression of a complex number in polar form. If
$$z = x + iy = r(\cos\theta + i\sin\theta),$$
then we can write $\cos\theta + i\sin\theta$ as $e^{i\theta}$, so
$$z = re^{i\theta}.$$


A complex number expressed in this way is said to be in exponential form.


A non-zero complex number $z = x + iy = r(\cos\theta + i\sin\theta)$ is in
exponential form if it is expressed as
$$z = re^{i\theta}.$$


Rather than using the term exponential form, some texts regard $re^{i\theta}$ as
another version of polar form, since it involves the modulus and angle of
the complex number.


When we use exponential form for complex numbers, de Moivre’s Theorem
(Theorem A4) becomes the simple result
$$(e^{i\theta})^n = e^{in\theta}, \quad\text{for all } \theta \in \mathbb{R} \text{ and all } n \in \mathbb{Z}.$$


Similarly, if $z_1 = r_1 e^{i\theta_1}$ and $z_2 = r_2 e^{i\theta_2}$, the rules for multiplying and
dividing complex numbers become the following simple results:
$$z_1 z_2 = r_1 e^{i\theta_1} \times r_2 e^{i\theta_2} = r_1 r_2 e^{i(\theta_1 + \theta_2)},$$
$$\frac{z_1}{z_2} = \frac{r_1 e^{i\theta_1}}{r_2 e^{i\theta_2}} = \frac{r_1}{r_2} e^{i(\theta_1 - \theta_2)} \quad (\text{provided } z_2 \neq 0).$$


There is also a useful formula for the complex conjugate of a complex
number in exponential form, as follows.


If $z = re^{i\theta}$, then $\bar{z} = re^{-i\theta}$.


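This formula can be proved as follows. If $z = re^{i\theta}$, then
$z = r(\cos\theta + i\sin\theta)$, so
$$\begin{aligned}
\bar{z} &= r(\cos\theta - i\sin\theta) \\
&= r(\cos(-\theta) + i\sin(-\theta)) \\
&= re^{-i\theta}.
\end{aligned}$$
The second line here follows from the trigonometric identities
$\cos(-\theta) = \cos\theta$ and $\sin(-\theta) = -\sin\theta$.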


Exercise A83 


Use Euler’s Identity to prove that if $z = re^{i\theta}$, then $-z = re^{i(\theta + \pi)}$.


2.6 Summary: Cartesian, polar and exponential form


You have seen in the previous subsections that certain calculations with 
complex numbers are considerably easier in some forms than in others. For 
example, if we use polar form or exponential form, then we can easily find 
powers using de Moivre’s Theorem (Theorem A4). Here is a summary of 
the main features of the different forms of a complex number, and how to 
convert between them. 


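Let
$$z = x + iy = r(\cos\theta + i\sin\theta) = re^{i\theta},$$
$$z_1 = x_1 + iy_1 = r_1(\cos\theta_1 + i\sin\theta_1) = r_1 e^{i\theta_1},$$
$$z_2 = x_2 + iy_2 = r_2(\cos\theta_2 + i\sin\theta_2) = r_2 e^{i\theta_2}.$$


Complex conjugate
Cartesian form      $\bar{z} = x - iy$
Polar form          $\bar{z} = r(\cos\theta - i\sin\theta)$
Exponential form    $\bar{z} = re^{-i\theta}$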




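Product
Cartesian form      Use the usual rules of arithmetic to find $z_1 z_2$.
Polar form          $z_1 z_2 = r_1 r_2(\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2))$
Exponential form    $z_1 z_2 = r_1 r_2 e^{i(\theta_1 + \theta_2)}$


Reciprocal (In each case, $z \neq 0$.)
Cartesian form      $\dfrac{1}{z} = \dfrac{1}{z} \times \dfrac{\bar{z}}{\bar{z}} = \dfrac{\bar{z}}{|z|^2}$
Polar form          $\dfrac{1}{z} = \dfrac{1}{r}(\cos(-\theta) + i\sin(-\theta)) = \dfrac{1}{r}(\cos\theta - i\sin\theta)$
Exponential form    $\dfrac{1}{z} = \dfrac{1}{r} e^{-i\theta}$


Quotient (In each case, $z_2 \neq 0$.)
Cartesian form      $\dfrac{z_1}{z_2} = \dfrac{z_1}{z_2} \times \dfrac{\bar{z}_2}{\bar{z}_2} = \dfrac{z_1 \bar{z}_2}{|z_2|^2}$
Polar form          $\dfrac{z_1}{z_2} = \dfrac{r_1}{r_2}(\cos(\theta_1 - \theta_2) + i\sin(\theta_1 - \theta_2))$
Exponential form    $\dfrac{z_1}{z_2} = \dfrac{r_1}{r_2} e^{i(\theta_1 - \theta_2)}$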


Converting polar and exponential form to Cartesian form


Use the equations
$$x = r\cos\theta, \quad y = r\sin\theta.$$


Converting Cartesian form to polar and exponential form

Find the modulus $r$, using $r = |z| = \sqrt{x^2 + y^2}$.


Mark $z$ on a sketch of the complex plane. Find the acute angle $\phi$ at the
origin in the right-angled triangle formed by drawing the perpendicular
from $z$ to the real axis, using
$$\cos\phi = \frac{|x|}{r}.$$


Hence find the principal argument.
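

The conversion from Cartesian form to polar form summarised above can be sketched in Python as follows. This is an illustration, not the module's own procedure in code: the function names `principal_argument` and `to_polar` are assumptions, the quadrant formulas are those of Figure 15, and the sample value is the number from Worked Exercise A30(b).

```python
import math

def principal_argument(x, y):
    """Principal argument of x + iy, from the acute angle phi and the quadrant."""
    if x == 0 and y == 0:
        raise ValueError("the argument of 0 is not defined")
    # Numbers on the axes have principal argument 0, pi, pi/2 or -pi/2.
    if y == 0:
        return 0.0 if x > 0 else math.pi
    if x == 0:
        return math.pi / 2 if y > 0 else -math.pi / 2
    r = math.hypot(x, y)
    # Acute angle with cos(phi) = |x| / r; min() guards against rounding above 1.
    phi = math.acos(min(1.0, abs(x) / r))
    if x > 0 and y > 0:          # first quadrant
        return phi
    if x < 0 and y > 0:          # second quadrant
        return math.pi - phi
    if x < 0 and y < 0:          # third quadrant
        return -(math.pi - phi)
    return -phi                  # fourth quadrant

def to_polar(x, y):
    """Return the modulus and principal argument of x + iy."""
    return (math.hypot(x, y), principal_argument(x, y))

r, theta = to_polar(-0.5, -math.sqrt(3) / 2)
print(r, theta)  # approximately 1 and -2*pi/3
```

In practice the standard library function `math.atan2(y, x)` returns the same angle directly; the version above is written out to mirror the sketch-based method.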




3 Modular arithmetic 


In this section you will see how we can do arithmetic with finite sets of 
integers. We do this by using modular arithmetic, which you should have 
met in your previous studies. This type of arithmetic is important in 
number theory (the study of the integers) and in cryptography. You will 
use it frequently in the group theory units of this module (Books B and E). 


3.1 The Division Theorem 


If we divide one positive integer by another we obtain a quotient and a
remainder. For example, 29 divided by 4 gives quotient 7 and remainder 1
because $29 = 7 \times 4 + 1$. If we divide any positive integer by 4, the
remainder will be one of the numbers 0, 1, 2, 3.


This idea can be extended to the division of a negative integer by a
positive integer. For example, $-19$ divided by 4 gives quotient $-5$ and
remainder 1 because $-19 = (-5) \times 4 + 1$. If we divide any negative integer
by 4, the remainder is again one of the numbers 0, 1, 2, 3.


This result can be generalised to the following theorem. 


Theorem A9 Division Theorem 


Let a and n be integers, with n > 0. Then there are unique integers q 
and r such that 


$$a = qn + r, \quad\text{with } 0 \leq r < n.$$


We say that dividing a by the divisor n gives the quotient q and 
remainder r. 


A formal proof of Theorem A9 is not given here, but the theorem can be
illustrated as follows. We mark integer multiples of $n$ along the real line as
shown in Figure 16, and then observe in which of the resulting intervals of
length $n$ the integer $a$ lies. Suppose that $a$ lies in the interval
$[qn, (q + 1)n)$, so that $qn \leq a < (q + 1)n$, as illustrated.


Figure 16 The number $a$ in the interval $[qn, (q + 1)n)$


Then, if we let $r = a - qn$, we have $a = qn + r$ and $0 \leq r < n$, which is the
required result.
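

The quotient and remainder of the Division Theorem can be computed with Python's built-in `divmod`. The wrapper name `quotient_remainder` below is an illustrative assumption. For a positive divisor, Python's integer division rounds the quotient down, so the remainder it returns already satisfies $0 \leq r < n$, even when $a$ is negative.

```python
def quotient_remainder(a, n):
    """Return (q, r) with a = q*n + r and 0 <= r < n, for an integer a and n > 0."""
    if n <= 0:
        raise ValueError("the divisor n must be a positive integer")
    q, r = divmod(a, n)  # floor division, so r is never negative when n > 0
    return q, r

print(quotient_remainder(29, 4))   # (7, 1), since 29 = 7 x 4 + 1
print(quotient_remainder(-19, 4))  # (-5, 1), since -19 = (-5) x 4 + 1
```

Some other programming languages truncate the quotient towards zero instead, which can give a negative remainder for negative $a$; that convention does not match the theorem as stated here.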




Exercise A84 


For each of the following integers a and n, find the quotient and remainder 
on division of a by n. 


(a) $a = 65$, $n = 7$    (b) $a = -256$, $n = 13$


Exercise A85 


(a) What are the possible remainders on division of an integer by 7? 


(b) Find two positive and two negative integers all of which have 
remainder 3 on division by 7. 


3.2 Congruence 


The Division Theorem (Theorem A9) tells us that, when we divide any 
integer by a positive integer n, the set of possible remainders is 
{0,1,2,...,n—1}. Integers that differ by a multiple of n have the same 
remainder on division by n and are, in this sense, ‘the same’ as each other. 
We now introduce some notation and terminology for this idea of 
‘sameness’, which is known as congruence. 


Definitions 


Let $n$ be a positive integer. Two integers $a$ and $b$ are congruent
modulo $n$ if $a - b$ is a multiple of $n$; that is, if $a$ and $b$ have the same
remainder on division by $n$.


In symbols we write
$$a \equiv b \pmod{n}.$$


Such a statement is called a congruence, and n is called the 
modulus of the congruence. 


The word ‘modulus’ here has a different meaning from its use to mean the 
‘size’ of a real number or a complex number. This different usage reminds 
us that it is always important to interpret technical terms according to 
their context. 


We read ‘a ≡ b (mod n)’ as ‘a is congruent to b modulo n’.


The terms ‘congruent’ and ‘modulus’, together with the symbol for 
congruence, all appear for the first time in Gauss’s classic text 
Disquisitiones Arithmeticae (Arithmetical Investigations) of 1801, the 
work which, in the words of historian Olaf Neumann, ‘transformed 
number theory from a scattering of islands into an established 
continent of mathematics.’ 


Worked Exercise A37 


Which of the following congruences are true, and which are false? 
(a) 27 ≡ 5 (mod 11) (b) 14 ≡ −6 (mod 3)
(c) 343 ≡ 207 (mod 68) (d) 1 ≡ −1 (mod 2)
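
One way to test congruences like these is to apply the definition directly: a and b are congruent modulo n exactly when a − b is a multiple of n. The sketch below does this in Python; the helper name is_congruent is an assumption made for illustration.

```python
def is_congruent(a, b, n):
    """True if a and b are congruent modulo n, that is, if n divides a - b."""
    if n <= 0:
        raise ValueError("the modulus n must be a positive integer")
    return (a - b) % n == 0


# The four congruences of the worked exercise, as (a, b, n) triples.
for a, b, n in [(27, 5, 11), (14, -6, 3), (343, 207, 68), (1, -1, 2)]:
    print(f"{a} ≡ {b} (mod {n}):", is_congruent(a, b, n))
```

Here 27 − 5 = 22 = 2 × 11 and 343 − 207 = 136 = 2 × 68, whereas 14 − (−6) = 20 is not a multiple of 3.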


Exercise A86 


Find the remainder on division by 17 of each of the numbers 25, 53, —15, 3 
and 127, and state any congruences modulo 17 that exist between these 
numbers. 




We shall need to use some properties of congruences in the following 
sections, so we state these properties here. This may seem a long list, but 
these properties are quite simple; in fact, they are what you might expect. 


Theorem A10 Properties of congruences 


Let n and m be positive integers, and let a, b, c, d be integers. The 
following properties hold. 


Reflexivity a ≡ a (mod n).

Symmetry If a ≡ b (mod n), then b ≡ a (mod n).

Transitivity If a ≡ b (mod n) and b ≡ c (mod n), then
a ≡ c (mod n).

Addition If a ≡ b (mod n) and c ≡ d (mod n), then
a + c ≡ b + d (mod n).

Multiplication If a ≡ b (mod n) and c ≡ d (mod n), then
ac ≡ bd (mod n).

Powers If a ≡ b (mod n), then a^m ≡ b^m (mod n).


To see why these properties hold, we use the definition of congruence: two 
integers a and b are congruent modulo n if a — b is a multiple of n. 


The reflexive property holds because a − a = 0 = 0 × n, so we have

a ≡ a (mod n).

To see why the symmetric property holds, suppose that a ≡ b (mod n), so
a − b = kn for some integer k. But b − a = −(a − b), so b − a = (−k)n.
Since −k is also an integer, it follows that b ≡ a (mod n).


We can see that the transitive property holds in a similar way. Suppose
that a ≡ b (mod n) and b ≡ c (mod n). Then a − b = kn and b − c = ln for
some integers k and l. Hence


a − c = (a − b) + (b − c) = kn + ln = (k + l)n.


Since k + l is an integer, it follows that a ≡ c (mod n).


In the next worked exercise we prove that the addition property holds, and 
you are asked to prove the multiplication property in Exercise A87. 


Worked Exercise A38 


Prove that the addition property for congruences holds: 


if a ≡ b (mod n) and c ≡ d (mod n), then a + c ≡ b + d (mod n).




Exercise A87 


Prove that the multiplication property for congruences holds: 


if a ≡ b (mod n) and c ≡ d (mod n), then ac ≡ bd (mod n).


The powers property is obtained by applying the multiplication property
repeatedly. Suppose that a ≡ b (mod n). Then the multiplication property
(applied with c = a and d = b) gives


a² ≡ b² (mod n).
We can now apply the multiplication property to
a ≡ b (mod n) and a² ≡ b² (mod n)
to obtain
a³ ≡ b³ (mod n).
Continuing in this way, we obtain
a^m ≡ b^m (mod n) for any m ∈ N,
which is the powers property.
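
The addition, multiplication and powers properties can also be checked by brute force over a small range of integers. The Python sketch below is a numerical sanity check only, not a proof; the names same_class and check_properties, and the ranges used, are choices made here for illustration.

```python
from itertools import product


def same_class(a, b, n):
    """True if a and b are congruent modulo n."""
    return (a - b) % n == 0


def check_properties(n, values=range(-6, 7), max_power=5):
    """Check the addition, multiplication and powers properties modulo n
    for all integers in `values`; an AssertionError signals a failure."""
    for a, b, c, d in product(values, repeat=4):
        if same_class(a, b, n) and same_class(c, d, n):
            assert same_class(a + c, b + d, n)   # addition property
            assert same_class(a * c, b * d, n)   # multiplication property
    for a, b in product(values, repeat=2):
        if same_class(a, b, n):
            for m in range(1, max_power + 1):
                assert same_class(a ** m, b ** m, n)   # powers property
    return True


print(all(check_properties(n) for n in range(1, 8)))
```

The check passes for every modulus tried, as Theorem A10 guarantees.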


The properties of congruences in Theorem A10 are particularly useful 
when we want to find the remainder of a large integer on division by 
another integer, as the next worked exercise illustrates. 




Worked Exercise A39 


(a) Find the remainders of both 2375 and 5421 on division by 22. 
(b) Find the remainder of 2375 x 5421 on division by 22. 
(c) Find the remainder of (2375)¹⁵ on division by 22.


Solution 


(a) Start with 2375 and subtract or add convenient multiples of
22 until you reach an integer in {0, 1, 2, ..., 21}. Here we can
subtract 2200, then 110, then 66, then add 22.


Using the transitivity property of congruences we obtain

2375 ≡ 175 ≡ 65 ≡ −1 ≡ 21 (mod 22).


Do the same for 5421. We can subtract 4400, then 880, then
110, then 22.


Similarly,


5421 ≡ 1021 ≡ 141 ≡ 31 ≡ 9 (mod 22).


So 2375 has remainder 21 on division by 22, and 5421 has 
remainder 9 on division by 22. 


(b) Use the multiplication property. Find integers congruent
modulo 22 to 2375 and 5421 that are easier to multiply.


Using the multiplication property of congruences and the answer
to part (a), we obtain


2375 × 5421 ≡ 21 × 9 ≡ (−1) × 9 ≡ −9 ≡ 13 (mod 22),
so 2375 × 5421 has remainder 13 on division by 22.


(c) Use the powers property. Find an integer congruent
modulo 22 to 2375 whose powers are easier to find.


Using the powers property of congruences and the answer to
part (a), we obtain


(2375)¹⁵ ≡ 21¹⁵ ≡ (−1)¹⁵ ≡ −1 ≡ 21 (mod 22),
so (2375)¹⁵ has remainder 21 on division by 22.


The worked exercise above shows that there is a real advantage in using
congruences, since the number (2375)¹⁵ is too large to fit into the memory
of most computers.
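
The same idea, reducing modulo n before multiplying or taking powers, can be sketched in Python. The helper name remainder is introduced here for illustration, and the exponent 15 is used as an illustrative odd exponent; the three-argument form of the built-in pow reduces modulo n at every step, so it never forms the full power.

```python
def remainder(a, n):
    """Remainder of a on division by the positive integer n (0 <= r < n)."""
    return a % n


n = 22
r1 = remainder(2375, n)   # 21
r2 = remainder(5421, n)   # 9

# Multiplication property: reduce each factor first, then multiply.
product_remainder = remainder(r1 * r2, n)   # 13

# Powers property: reduce the base first. The exponent is an
# illustrative choice; any odd exponent gives the same remainder here,
# because 21 is congruent to -1 modulo 22.
exponent = 15
power_remainder = pow(r1, exponent, n)      # 21

# The reduced computations agree with the direct ones.
assert product_remainder == (2375 * 5421) % n
assert power_remainder == (2375 ** exponent) % n
print(r1, r2, product_remainder, power_remainder)
```

Working with the small representatives 21 and 9 (or −1 in place of 21) keeps every intermediate number below n², which is the practical benefit of the congruence properties.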




Exercise A88 


(a) Find the remainder of both 3869 and 1685 on division by 16. 
(b) Find the remainder of 3869 + 1685 on division by 16. 


(c) Find the remainder of (3869 + 1685)⁴ on division by 16, and hence
find the remainder of (3869 + 1685)!"! on division by 16. 


3.3 Operations in Z_n


The Division Theorem (Theorem A9) tells us that all the possible
remainders on dividing an integer by a positive integer n lie in the set


{0, 1, ..., n − 1}.


We denote this set by Z_n. For each integer n ≥ 2 we have a set Z_n, and it
is on these sets that we perform modular arithmetic. The modular addition
operations +_n and modular multiplication operations ×_n are defined as
follows.


Definitions 


For any integer n ≥ 2,
Z_n = {0, 1, ..., n − 1}.


For a and b in Z_n, the operations +_n and ×_n are defined by:
a +_n b is the remainder of a + b on division by n;


a ×_n b is the remainder of a × b on division by n.


The integer n is called the modulus for this arithmetic. 


We read a +_n b as ‘a plus b, mod n’, and a ×_n b as ‘a times b, mod n’.
For example, Z_7 = {0, 1, 2, 3, 4, 5, 6} and we have
3 + 6 = 9, so 3 +_7 6 = 2,
3 × 6 = 18, so 3 ×_7 6 = 4.
You have certainly met some modular arithmetic before, as the operations
+_12 and +_24 are used in measuring time on 12-hour and 24-hour clocks,
respectively.


Arithmetic carried out on the elements of the set Z_n using the operations
+_n and ×_n is called arithmetic modulo n.
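
The two operations can be written down directly from their definitions. The following Python sketch uses the illustrative names add_mod and mul_mod (not notation from this unit) and checks that the inputs really are elements of Z_n.

```python
def _check(a, b, n):
    """Raise an error unless n >= 2 and a, b are elements of Z_n."""
    if n < 2:
        raise ValueError("the modulus n must be an integer with n >= 2")
    if not (0 <= a < n and 0 <= b < n):
        raise ValueError("a and b must be elements of Z_n = {0, 1, ..., n - 1}")


def add_mod(a, b, n):
    """a +_n b: the remainder of a + b on division by n."""
    _check(a, b, n)
    return (a + b) % n


def mul_mod(a, b, n):
    """a x_n b: the remainder of a * b on division by n."""
    _check(a, b, n)
    return (a * b) % n


print(add_mod(3, 6, 7))    # 2
print(mul_mod(3, 6, 7))    # 4
print(add_mod(9, 5, 12))   # 2: five hours after 9 o'clock on a 12-hour clock
```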




Exercise A89 


Evaluate the following. 
(a) 3 +_5 2 (b) 4 +_17 5 (c) 8 +_16 12
(d) 3 ×_5 2 (e) 4 ×_17 5 (f) 8 ×_16 12


You can often use the properties of congruences to help you carry out 
arithmetic modulo n efficiently, without using a calculator, as 
demonstrated in the next worked exercise. There are usually many 
different ways to proceed. 


Worked Exercise A40 


Evaluate the following. 
(a) 29 ×_31 18 (b) 12 ×_26 15


Solution


(a) Use the fact that 29 ≡ −2 (mod 31), and it is easier to
multiply by −2 than by 29. Remember that the final answer
needs to be an integer in Z_31.


We have
29 × 18 ≡ −2 × 18
= −36
≡ −5
≡ 26 (mod 31).


Thus 29 ×_31 18 = 26.


(b) Use the fact that to multiply by 12, we can first multiply by 2
and then by 6. The final answer needs to be an integer
in Z_26.

We have
12 × 15 = 6 × 2 × 15
= 6 × 30
≡ 6 × 4
= 24 (mod 26).


Thus 12 ×_26 15 = 24.


Exercise A90 


Calculate the following without using a calculator. 
(a) 7 ×_27 26 (b) 16 ×_29 14 (c) 9 ×_33 15 (d) 37 ×_45 23


(e) 15 ×_34 6 (f) 9 ×_40 18




In Subsection 1.2 you met a list of eleven properties that are satisfied by 
the set Q of rational numbers, the set R of real numbers and the set C of 
complex numbers. You saw that since these three sets each satisfy all 
eleven properties (together with a trivial twelfth property), these sets are 
all fields. You also saw that the set Z of integers does not satisfy all eleven 
of these properties, and so is not a field. In the rest of this section we will 
investigate whether the sets Z_n satisfy these properties.


We will also investigate which equations in Z_n have solutions; for example,
do the equations
x +_12 5 = 2, 5 ×_12 x = 7, 4 ×_12 x = 6


have solutions? These may look much simpler than the equations we were 
trying to solve in C, but they pose interesting questions. We shall see that 
the answers may depend on the modulus that we are using. 


Before we consider these questions further, we look at addition and 
multiplication tables, which provide a convenient way of studying addition 
and multiplication in Z_n.


We consider addition first. Here are the addition tables for Z_4 and Z_7.


+_4 | 0 1 2 3
----+--------
  0 | 0 1 2 3
  1 | 1 2 3 0
  2 | 2 3 0 1
  3 | 3 0 1 2

+_7 | 0 1 2 3 4 5 6
----+--------------
  0 | 0 1 2 3 4 5 6
  1 | 1 2 3 4 5 6 0
  2 | 2 3 4 5 6 0 1
  3 | 3 4 5 6 0 1 2
  4 | 4 5 6 0 1 2 3
  5 | 5 6 0 1 2 3 4
  6 | 6 0 1 2 3 4 5


In order to evaluate 4 +_7 2, say, we look in the second table at the row
labelled 4 and the column labelled 2 to obtain the answer 6.
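
Tables like these are easy to generate mechanically. The Python sketch below builds the addition table for Z_n as a list of rows; the function names are illustrative, and the simple printing routine lines up neatly only when the entries are single digits.

```python
def addition_table(n):
    """Rows of the addition table for Z_n: entry [a][b] is a +_n b."""
    return [[(a + b) % n for b in range(n)] for a in range(n)]


def print_addition_table(n):
    """Print the addition table for Z_n with row and column labels."""
    print(f"+_{n} | " + " ".join(str(b) for b in range(n)))
    for a, row in enumerate(addition_table(n)):
        print(f"{a:>3} | " + " ".join(str(entry) for entry in row))


print_addition_table(4)
print(addition_table(7)[4][2])   # 6, the entry in row 4 and column 2
```

Each row is the previous row shifted one place to the left, which is one of the patterns visible in the tables.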


Exercise A91 


(a) Use the tables above to solve the following equations. 
(i) x +_4 3 = 2. (ii) x +_7 5 = 2. (iii) x +_4 2 = 0. (iv) x +_7 5 = 0.


(b) What patterns do you notice in the tables? 


Exercise A92 


(a) Construct the addition table for Z_6.
(b) Solve the equations x +_6 1 = 5 and x +_6 5 = 1.




For every integer n ≥ 2, the additive properties of Z_n are the same as the
additive properties of R, as follows.


Addition in Z_n (n ≥ 2)
A1 Closure For all a, b ∈ Z_n,
a +_n b ∈ Z_n.
A2 Associativity For all a, b, c ∈ Z_n,
(a +_n b) +_n c = a +_n (b +_n c).
A3 Additive identity For all a ∈ Z_n,
a +_n 0 = a = 0 +_n a.
A4 Additive inverses For each a ∈ Z_n, there is a
number b ∈ Z_n such that
a +_n b = 0 = b +_n a.
A5 Commutativity For all a, b ∈ Z_n,
a +_n b = b +_n a.


The closure property (A1) follows because a +_n b is the remainder on
dividing a + b by n, which, from the Division Theorem (Theorem A9), is
in Z_n.

The other properties can be deduced from the corresponding properties for
integers. For example, we can see that the associativity property (A2)
holds as follows. By definition, (a +_n b) +_n c and a +_n (b +_n c) are the
remainders of the integers (a + b) + c and a + (b + c), respectively, on
division by n. Since ordinary addition is associative, we have

(a + b) + c = a + (b + c), so


(a +_n b) +_n c = a +_n (b +_n c).


Exercise A93 


By using the corresponding property for integers, prove that the 
commutativity property (A5) holds for Zn. 


The additive inverses property (A4) states that every element of Z_n has an
additive inverse in Z_n. For example, 4 and 5 belong to Z_9 and 4 +_9 5 = 0,
so 5 is an additive inverse of 4 in Z_9 (and vice versa).


Additive inverses are sometimes written in the form −_n a; that is, if
a +_n b = 0, then we write b = −_n a. For example, 5 = −_9 4.




Exercise A94 


(a) Use the addition table for Z_7 (given earlier and repeated as Table 1)
to complete the following table of additive inverses in Z_7.


a     | 0 1 2 3 4 5 6
−_7 a |

(b) Complete the following table of additive inverses in Z_n, justifying
your answers.


a     | 0 1 2 ... n − 1
−_n a |


Notice that each element a of Z_n has exactly one additive inverse in Z_n:
for a ≠ 0 it is the integer obtained by subtracting a from n, and the
additive inverse of 0 is 0 itself. For example, the additive inverse of 4
in Z_9 is 9 − 4 = 5.


The existence of additive inverses means that, as well as doing addition
modulo n, we can also do subtraction modulo n. We define a −_n b or,
equivalently, a − b (mod n), to be the remainder of a − b on division by n.


For example, to find 2 −_12 5, we have
2 − 5 = −3 ≡ 9 (mod 12).
Since 9 ∈ Z_12, it follows that


2 −_12 5 = 9.
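
Additive inverses and subtraction modulo n can be sketched in Python as follows; the names additive_inverse and sub_mod are illustrative. Because Python's % operator returns a non-negative remainder for a positive modulus, the expression (-a) % n gives n − a when a is non-zero and 0 when a is 0.

```python
def additive_inverse(a, n):
    """The additive inverse of a in Z_n: the b in Z_n with a +_n b = 0."""
    return (-a) % n


def sub_mod(a, b, n):
    """a -_n b: the remainder of a - b on division by n."""
    return (a - b) % n


print(additive_inverse(4, 9))   # 5
print(additive_inverse(0, 9))   # 0
print(sub_mod(2, 5, 12))        # 9

# Subtracting b is the same as adding the additive inverse of b.
assert all(
    sub_mod(a, b, 12) == (a + additive_inverse(b, 12)) % 12
    for a in range(12)
    for b in range(12)
)
```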


3.4 Multiplicative inverses in Z_n


In the last subsection it was stated that, for any integer n ≥ 2, the set Z_n
satisfies the same rules for addition modulo n as the real numbers satisfy 
for ordinary addition. When it comes to multiplication in Zn, most of the 
familiar rules for multiplication of the real numbers are true. In particular, 
the following properties hold. 


Multiplication in Z_n (n ≥ 2)
M1 Closure For all a, b ∈ Z_n,
a ×_n b ∈ Z_n.
M2 Associativity For all a, b, c ∈ Z_n,
(a ×_n b) ×_n c = a ×_n (b ×_n c).
M3 Multiplicative identity For all a ∈ Z_n,
a ×_n 1 = a = 1 ×_n a.
M5 Commutativity For all a, b ∈ Z_n,
a ×_n b = b ×_n a.


Table 1

+_7 | 0 1 2 3 4 5 6
----+--------------
  0 | 0 1 2 3 4 5 6
  1 | 1 2 3 4 5 6 0
  2 | 2 3 4 5 6 0 1
  3 | 3 4 5 6 0 1 2
  4 | 4 5 6 0 1 2 3
  5 | 5 6 0 1 2 3 4
  6 | 6 0 1 2 3 4 5


The following property also holds. 


Combining addition and multiplication in Z_n (n ≥ 2)
D1 Distributivity For all a, b, c ∈ Z_n,


a ×_n (b +_n c) = (a ×_n b) +_n (a ×_n c).


These properties can be shown to hold in a similar way to the additive 
properties. You will notice that one property is missing from the list of 
multiplicative properties, namely the multiplicative inverses property (M4). 


We say that b is a multiplicative inverse of a in Z_n if a, b ∈ Z_n and
a ×_n b = b ×_n a = 1. We now investigate the existence of multiplicative
inverses.


Here are the multiplication tables for Z_4 and Z_7.


×_4 | 0 1 2 3
----+--------
  0 | 0 0 0 0
  1 | 0 1 2 3
  2 | 0 2 0 2
  3 | 0 3 2 1

×_7 | 0 1 2 3 4 5 6
----+--------------
  0 | 0 0 0 0 0 0 0
  1 | 0 1 2 3 4 5 6
  2 | 0 2 4 6 1 3 5
  3 | 0 3 6 2 5 1 4
  4 | 0 4 1 5 2 6 3
  5 | 0 5 3 1 6 4 2
  6 | 0 6 5 4 3 2 1


The table for Z_7 shows that, for example, 3 ×_7 5 = 5 ×_7 3 = 1, so 5 is a
multiplicative inverse of 3 in Z_7.


Exercise A95 


(a) Use the tables above to answer the following. 


(i) Which elements of Z4 have multiplicative inverses? 
(ii) Find a multiplicative inverse of every element of Z7 except 0. 


(b) Construct a multiplication table for Z_10, and determine which
elements of Z_10 have multiplicative inverses.


In Exercise A95, you saw that, in contrast to R and C, there are some 
values of n for which the number system Z_n contains non-zero elements
that do not have a multiplicative inverse. 




Before we investigate further the question of which elements of each 
number system Zn have a multiplicative inverse, note that if an element of 
Zn does have a multiplicative inverse, then it has only one. The 
multiplication tables for Z_4, Z_7 and Z_10 show that this is true for these
three number systems, but it is in fact true for any Z_n, though this is less
obvious than it is for additive inverses.


To see that it is true in general, suppose that a is an element of Z_n, for
some integer n ≥ 2, and that both b and c are multiplicative inverses of a
in Z_n. Then
b = 1 ×_n b
= c ×_n a ×_n b (since c ×_n a = 1)
= c ×_n 1 (since a ×_n b = 1)
= c.
That is, b and c are in fact the same element of Z_n. Thus a has just one
multiplicative inverse in Z_n. We say that the inverse of a is unique.


When it exists, we denote the multiplicative inverse b of an element a of
Z_n by a⁻¹ and refer to it as the multiplicative inverse of a in Z_n.


Notice also that if an element a of a number system Z_n has a
multiplicative inverse b in Zn, then b also has a multiplicative inverse in 
Zn, namely a. This follows from the definition of a multiplicative inverse. 
For example, in Z7, the elements 3 and 5 are inverses of each other. 


Let us now turn to the question of which elements of each number system 
Zn have a multiplicative inverse. This question is connected with the 
common factors of a and n. 


Definitions 


Two integers a and b have a common factor c, where c is a natural 
number, if a and b are both divisible by c. 


Two integers a and b are coprime (or relatively prime) if their only 
common factor is 1. 


The highest common factor (HCF) of two integers a and b is their 
largest common factor. 


If two integers a and b are coprime, we also say that a is coprime to b, or 
that b is coprime to a. 


In some texts the highest common factor of a and b is called the ‘greatest
common divisor’ (GCD). 


It turns out that an element a of Z_n has a multiplicative inverse in Z_n
exactly when a and n are coprime. That is, if a and n are coprime, then a
has a multiplicative inverse in Z_n, but if a and n are not coprime, then a
has no multiplicative inverse.
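
This criterion can be checked by brute force for small moduli. In the Python sketch below, has_inverse searches Z_n for an inverse directly from the definition, and criterion_holds compares the outcome with the coprimality test, using math.gcd from the standard library to compute the highest common factor. The function names are illustrative, and the check is numerical evidence, not a proof.

```python
from math import gcd


def has_inverse(a, n):
    """True if some b in Z_n satisfies a x_n b = 1 (found by search)."""
    return any((a * b) % n == 1 for b in range(n))


def criterion_holds(n):
    """True if, for every a in Z_n, a has an inverse exactly when
    a and n are coprime."""
    return all(has_inverse(a, n) == (gcd(a, n) == 1) for a in range(n))


print(all(criterion_holds(n) for n in range(2, 41)))
print([a for a in range(12) if has_inverse(a, 12)])   # [1, 5, 7, 11]
```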




We prove this important result later in this subsection, but first we 
consider how to find multiplicative inverses where they exist. Of course, we 
could do this by trial and error, or by writing out the multiplication table 
for Zn, but for large values of n these methods are very cumbersome. 
Fortunately a more efficient method exists, based on Euclid’s Algorithm. 


Euclid’s Algorithm is a method for finding the highest common factor of 
two positive integers, first described (albeit in a different form) in Euclid’s 
Elements, which dates from around 300 BCE. Given an element a of Zn, we 
can apply Euclid’s Algorithm to determine whether or not a and n are 
coprime; if they are coprime, then we know that a has a multiplicative 
inverse in Zn, but otherwise it does not. Moreover, if a and n are coprime, 
then we can use the equations that arise from applying Euclid’s Algorithm 
to work out the multiplicative inverse of a, using a method known as 
backwards substitution. 


Euclid’s Algorithm proceeds by repeatedly applying the Division Theorem, 
as in the following example. Suppose we want to find the highest common 
factor of the integers 32 and 9. We start by dividing 32 by 9, which gives 
quotient 3 and remainder 5, as in the equation 


32 = 3 × 9 + 5.
Next, we divide 9 by 5, giving quotient 1, remainder 4 and the equation
9 = 1 × 5 + 4.


We continue in this way, at each step forming a new equation by dividing 
the divisor from the previous equation by the remainder from that 
equation. The complete list of equations arising from Euclid’s Algorithm 
in this example is given below. 


32 = 3 × 9 + 5
9 = 1 × 5 + 4
5 = 1 × 4 + 1
4 = 4 × 1 + 0


We stop when the remainder is 0, which gives us the last equation. We always
eventually reach this stage, because the remainders decrease by at least 1 
at each step. The remainder in the second-to-last equation is the highest 
common factor of the two integers we started with. 


So, for example, the list of equations above shows that the highest 
common factor of 32 and 9 is 1; they are coprime. We conclude that the 
number 9 does have a multiplicative inverse in Z_32.
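
The steps of Euclid's Algorithm can be sketched in Python as a loop that applies the Division Theorem repeatedly and records each equation. The function names euclid_steps and hcf are introduced here for illustration, and the sketch assumes that both starting integers are positive.

```python
def euclid_steps(a, b):
    """Apply Euclid's Algorithm to positive integers a and b.

    Returns a list of tuples (dividend, quotient, divisor, remainder),
    one for each equation dividend = quotient * divisor + remainder.
    """
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive integers")
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, q, b, r))
        a, b = b, r   # the divisor and remainder form the next pair
    return steps


def hcf(a, b):
    """Highest common factor: the divisor in the last equation, which is
    the remainder in the second-to-last equation when there is one."""
    return euclid_steps(a, b)[-1][2]


for dividend, q, divisor, r in euclid_steps(32, 9):
    print(f"{dividend} = {q} x {divisor} + {r}")
print(hcf(32, 9))   # 1
```

For 32 and 9 the loop prints the four equations listed above, ending with the equation whose remainder is 0.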


Before we describe the process of backwards substitution and use it to find 
this inverse, let us see why Euclid’s Algorithm works. Suppose we have two 
positive integers, say a_1 and a_2, and we apply the Division Theorem to
obtain the equation


a_1 = q a_2 + a_3,
where 0 ≤ a_3 < a_2. This equation can be rearranged as
a_3 = a_1 − q a_2.


It follows from this rearranged equation that any integer that is a factor of
both a_1 and a_2 (and so is a factor of a_1 − q a_2) must also be a factor of a_3.
Thus any common factor of a_1 and a_2 is also a common factor of a_2 and
a_3. Moreover, the unrearranged form of the equation tells us, by a similar
argument, that any common factor of a_2 and a_3 must also be a common
factor of a_1 and a_2. It follows that the highest common factor (HCF) of a_1
and a_2 is equal to the HCF of a_2 and a_3.


So, at each stage of Euclid’s Algorithm, a pair of integers a_1, a_2 leads to
another pair of integers a_2, a_3 with the same highest common factor. In
the example above, we obtain the sequence of pairs 


32 and 9, 9 and 5, 5 and 4, 4 and 1, 1 and 0,


and each pair has the same HCF. 


The final pair of integers always has second integer 0, so its HCF is its first 
integer; this is the integer, say d, that appears as the remainder in the 
second-to-last equation. Since each pair has the same HCF, it follows that 
the HCF of the original pair is also d. 


Euclid’s Algorithm is much quicker to apply than to describe! Try it for 
yourself in the next exercise. 


Exercise A96 


Use Euclid’s Algorithm to find the HCF of 201 and 81. Deduce whether or 
not the integer 81 has a multiplicative inverse in Z_201.


If we have applied Euclid’s Algorithm to find the HCF of two positive 
integers n and a with n > a, and found that the HCF is 1, we can then use 
the list of equations obtained from the algorithm to find the multiplicative 
inverse of a in Z_n using the method of backwards substitution.


To illustrate the method, let us apply it to our example of Euclid’s 
Algorithm above: this will yield the multiplicative inverse of 9 in Z_32.




The first step is to rearrange each of the equations from Euclid’s 
Algorithm to make the remainders the subjects of the equations. (We do 
not need the last equation, the one with remainder 0.) This gives 

5 = 32 − 3 × 9

4 = 9 − 1 × 5

1 = 5 − 1 × 4.
Notice that the last equation above expresses 1 as the sum of a multiple 
of 5 and a multiple of 4. (One of the multiples is negative — here the 
multiple of 4. This will always be the case because our starting integers, 
here 32 and 9, are both positive.) 


The goal of the backwards substitution process is to obtain an equation 
that expresses 1 as the sum of multiples of our original two positive 
integers, 32 and 9. (Again, one of these multiples must be negative.) 


The backwards substitution process starts with the last equation from 
Euclid’s Algorithm: 


1 = 5 − 1 × 4.


Next, we use the second-to-last equation from Euclid’s Algorithm to 
substitute for the 4 in the right-hand side of this equation. We then 
simplify the resulting equation to express 1 as the sum of multiples of 9 
and 5, like this:
1 = 5 − (9 − 1 × 5)
= (−1 × 9) + (2 × 5).
Notice that, in simplifying the equation, we treat the 9 and 5 as if they
were variables, in the same way that we would simplify the expression
x − (y − 1x) to give −1y + 2x.
Now we repeat the process, using the third-to-last equation to substitute
for the 5 in the right-hand side of this equation, then simplifying again to
express 1 as the sum of multiples of 32 and 9:
1 = (−1 × 9) + 2 × (32 − 3 × 9)
= (2 × 32) + (−7 × 9).
We continue in this way, working upwards through all the equations from 
Euclid’s Algorithm. In this case, though, there are no more equations and 
we have reached our goal: an equation that expresses 1 as the sum of 
multiples of 32 and 9. 


We are now only a few short steps from finding the multiplicative inverse 
of 9 in Z32. 


First, we rearrange our final equation to obtain a multiple of the smaller of 
the two integers, 9, on the left-hand side, and a multiple of the larger 
integer, 32, together with the term +1, on the right-hand side: 


(−7) × 9 = (−2) × 32 + 1.
Next, we note that it follows from this equation that


(−7) × 9 ≡ 1 (mod 32).


Now −7 ∉ Z_32, but since −7 ≡ 25 (mod 32) and 25 ∈ Z_32, we have

25 × 9 ≡ 1 (mod 32),
that is

25 ×_32 9 = 1.
Thus we have shown that the multiplicative inverse of 9 in Z_32 is 25; that
is, 9⁻¹ = 25 in Z_32.
We can check this as follows:

25 × 9 = 225

= 7 × 32 + 1,

so 25 × 9 ≡ 1 (mod 32), as expected.
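
The whole procedure, Euclid's Algorithm followed by backwards substitution, can be sketched in Python as below. The names bezout and inverse_mod are illustrative. The sketch keeps only the quotients from Euclid's Algorithm and then works upwards through them: if d = s × (divisor) + t × (remainder) for one equation, then substituting the rearranged previous equation gives d = t × (previous dividend) + (s − qt) × (previous divisor), where q is the quotient in that previous equation.

```python
def bezout(n, a):
    """Return (d, s, t) with d = HCF(n, a) and d == s*n + t*a,
    for positive integers n and a."""
    quotients = []
    x, y = n, a
    while y != 0:                 # Euclid's Algorithm
        q, r = divmod(x, y)
        quotients.append(q)
        x, y = y, r
    d = x
    # Backwards substitution, starting from the trivial statement
    # d = 0 * (last dividend) + 1 * d and working upwards.
    s, t = 0, 1
    for q in reversed(quotients[:-1]):
        s, t = t, s - q * t
    return d, s, t


def inverse_mod(a, n):
    """Multiplicative inverse of a in Z_n, if a and n are coprime."""
    d, s, t = bezout(n, a)
    if d != 1:
        raise ValueError(f"{a} has no multiplicative inverse in Z_{n}")
    return t % n                  # replace t by its remainder in Z_n


print(bezout(32, 9))       # (1, 2, -7): 1 = 2 x 32 + (-7) x 9
print(inverse_mod(9, 32))  # 25
```

In Python 3.8 and later the built-in call pow(9, -1, 32) gives the same inverse and can serve as an independent check.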


The next worked exercise gives another example of using this method to 
find a multiplicative inverse in a number system Z_n. In this example the
method is applied a little more efficiently. 


Worked Exercise A41 


Find the multiplicative inverse of 10 in Z_27.


Solution 


Apply Euclid’s Algorithm to 27 and 10, stopping when the
remainder 1 is obtained (since the final equation, with remainder 0, is
not needed).


Applying Euclid’s Algorithm gives


27 = 2 × 10 + 7
10 = 1 × 7 + 3
7 = 2 × 3 + 1.


Apply backwards substitution. We can do so by mentally
rearranging the equations above as we need them; the rearranged
equations are 1 = 7 − 2 × 3, 3 = 10 − 7 and 7 = 27 − 2 × 10.

Starting with the last equation, we have
1 = 7 − 2 × 3

= 7 − 2(10 − 7)

= −2 × 10 + 3 × 7

= −2 × 10 + 3(27 − 2 × 10)

= 3 × 27 − 8 × 10.

This final equation expresses 1 in terms of multiples of 27 and 10,
and can be rearranged as (−8) × 10 = (−3) × 27 + 1.


Hence


(−8) × 10 ≡ 1 (mod 27).

Since −8 ≡ 19 (mod 27) and 19 ∈ Z_27, it follows that 19 ×_27 10 = 1; that
is, the multiplicative inverse of 10 in Z_27 is 19.




Exercise A97 


Find the multiplicative inverse of 
(a) 7 in Z_16; (b) 8 in Z_51.


The method demonstrated above can be used to find a multiplicative 
inverse of an element a in a number system Z_n whenever a and n are
coprime. (The condition that a and n are coprime ensures that when we
carry out backwards substitution we have 1 on the left-hand side of the
equation; this 1 then becomes the 1 in the congruence of the form ab ≡ 1
(mod n).)

On the other hand, if a and n are not coprime, then a has no multiplicative 
inverse in Z_n. To see this, suppose that a and n are not coprime. If a did
have a multiplicative inverse, say b, in Z_n, then we would have


ab ≡ 1 (mod n),
that is,
ab = kn + 1 for some integer k.


But a and n, not being coprime, are both divisible by some integer greater
than 1, and hence ab − kn is also divisible by this integer, which is
impossible, since ab − kn = 1 by the equation above.


So we have the following result. 


Theorem A11 
Let n and a be positive integers, with a in Z_n.
• If a and n are coprime, then a has a multiplicative inverse in Z_n.


• If a and n are not coprime, then a does not have a multiplicative
inverse in Z_n.


Note that a more concise version of Theorem A11 is given in Subsection 1.4 
of Unit A3, and this is the version stated in the module Handbook. 


Theorem A11 gives us a further important result in the case when the 
modulus n is a prime number. 


Remember that a prime number (or prime) is an integer greater than 1 
whose only positive factors are 1 and itself; the first few primes are 2, 3, 5, 
7, 11, 13, 17, and 19. In contrast, a composite number is an integer 
greater than 1 that is not a prime number; the first few composite numbers 
are 4, 6, 8, 9, 10, 12, 14, 15. 


A prime number is necessarily coprime to every non-zero integer that is not 
a multiple of itself, so if p is a prime number, then every non-zero element 
of Z_p is coprime to p. Thus, by Theorem A11, we have the following result.


Multiplicative inverses in Z_p


Let p be a prime number. Then every non-zero element in Z_p has a
multiplicative inverse in Z_p.


It follows that for multiplication in Z_p, where p is a prime, we can add the
following property to the list of properties of multiplication in Z_p.


M4 Multiplicative inverses For each non-zero a ∈ Z_p where p is a
prime number, there is a number a⁻¹ ∈ Z_p such that


a ×_p a⁻¹ = 1 = a⁻¹ ×_p a.


So arithmetic with +_p and ×_p in Z_p, where p is a prime, satisfies all the
properties A1-A5 and M1-M5; that is, for both addition and
multiplication we have closure, associativity, an identity, inverses of all
non-zero elements, and commutativity. Also, the distributive
property (D1) holds for combining addition and multiplication. So, if p is a
prime, then the number system Z_p with arithmetic modulo p satisfies all
the properties in the list of eleven properties of R that you met in
Subsection 1.2. It also satisfies the twelfth, trivial, property mentioned
(since the additive identity 0 and multiplicative identity 1 of Z_p are not
equal). Therefore, when p is a prime, the number system Z_p with
arithmetic modulo p is a field, like R, Q and C.


However, the multiplicative inverses property (M4) does not hold for Z_n if
n is not prime, since in that case some non-zero elements a ∈ Z_n do not have
multiplicative inverses. So in general the number system Z_n with
arithmetic modulo n is not a field.




3.5 Solving linear equations in Z_n


We now return to the question of whether we can find solutions of 
equations in modular arithmetic. We consider linear equations, that is, 
equations of the form 


a ×_n x = b,
where a, b ∈ Z_n. We seek all solutions x ∈ Z_n.


Linear equations a ×_n x = b where a and n are coprime


First we consider the case where a and n are coprime. In this case, by
Theorem A11, a has a multiplicative inverse a⁻¹, and we can solve the
linear equation above by multiplying both sides by this inverse. In the
special case where n is a prime number, every non-zero element of Z_n has a
multiplicative inverse, so every linear equation a ×_n x = b with a ≠ 0 has
a solution.


Worked Exercise A42 


Solve the equation 10 ×_27 x = 14.


Note that the solution found in Worked Exercise A42 is the only solution 
of the given equation, because the multiplicative inverse of 10 in Z_27 is
unique. 


In general, by an argument similar to that of Worked Exercise A42, if a 
and n are coprime, then the linear equation 


a ×_n x = b
has the unique solution


x = a⁻¹ ×_n b.
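
The formula x = a⁻¹ ×_n b translates directly into a short computation. In the Python sketch below the inverse is found by a plain search through Z_n instead of by Euclid's Algorithm, purely to keep the example short; the function names are illustrative.

```python
def inverse_by_search(a, n):
    """Multiplicative inverse of a in Z_n, found by trying each element.
    Raises ValueError if no inverse exists (a and n not coprime)."""
    for b in range(n):
        if (a * b) % n == 1:
            return b
    raise ValueError(f"{a} has no multiplicative inverse in Z_{n}")


def solve_coprime(a, b, n):
    """Unique solution x in Z_n of a x_n x = b, for a coprime to n."""
    return (inverse_by_search(a, n) * b) % n


x = solve_coprime(10, 14, 27)
print(x)                  # 23
print((10 * x) % 27)      # 14, confirming that x is a solution
```

For the equation 10 ×_27 x = 14, the inverse of 10 in Z_27 is 19, and 19 × 14 = 266 = 9 × 27 + 23, so x = 23.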




Exercise A98 


Solve the following linear equations. 
(a) 7 ×_16 x = 3 (b) 8 ×_51 x = 19


(By the solution to Exercise A97, we have 7⁻¹ = 7 in Z_16, and 8⁻¹ = 32
in Z_51.)


To use the method of Worked Exercise A42 to solve an equation a ×_n x = b
where a and n are coprime, we first need to find the multiplicative inverse 
in Zn of the coefficient a of x. If we have not already found this inverse 
(for example, by using Euclid’s Algorithm and backwards substitution), 
and the modulus n is fairly small, then the quickest way to solve the 
equation may be just to try different values of x. We know that there is a 
unique solution, so we can stop trying values once we have found a 
solution. Sometimes a solution can be spotted by using congruences, as in 
the following worked exercise. 


Worked Exercise A43 


Solve the equation 5 ×_12 x = 7.


Exercise A99 


Solve the following equations. 
(a) 5X13" =2 (b) 3Xy2=5 


You may spot solutions using congruences as in Worked Exercise A43, or 
you may prefer to try values, or find and use multiplicative inverses. 




Linear equations a ×_n x = b where a and n are not coprime


Recall that we are considering the question of whether we can find
solutions in Z_n of equations of the form

a ×_n x = b, (5)

where a, b ∈ Z_n. You have seen how to solve an equation of this form when
a and n are coprime, so we now consider the case where a and n are not
coprime.


In this case, the equation may not have any solutions. To see this, observe 
that if equation (5) does have a solution, say c, then 


a ×_n c = b,
so

ac = b + kn for some integer k,
which gives

b = ac − kn.


This equation tells us that any integer that is a factor of both a and n 
must also be a factor of b. Therefore, if equation (5) does not satisfy this 
condition — that is, if there is an integer that is a factor of both a and n 
but not a factor of b — then the equation has no solutions. In particular, if 
the highest common factor (HCF) of a and n is not a factor of b, then the 
equation has no solutions. 


For example, the equation 
6 ×_18 x = 4


has no solutions, because the HCF of 6 and 18 is 6, and 6 is not a factor
of 4.


On the other hand, if the HCF of a and n is a factor of b, then it turns out 
that the equation always has a solution; in fact, it has d solutions, where d 
is the HCF. 


The box below summarises these facts about when the equation has 
solutions, and it also specifies what the solutions are when they exist. 


Linear equations in Z_n


Let d be the highest common factor of the integers a and n in the
equation


a ×_n x = b.


• If d is not a factor of b, then the equation has no solutions in Z_n.


• If d is a factor of b, then the equation has d solutions in Z_n. These
solutions are given by


x = c, c + n/d, c + 2n/d, ..., c + (d − 1)n/d,

where c is the solution in Z_{n/d} of the simpler equation

(a/d) ×_{n/d} x = b/d.


(Since a/d and n/d are coprime, the simpler equation has a unique
solution, which can be found using the methods given earlier.)


You will see a proof of the second bulleted statement in the box shortly. 
First, here is a worked exercise that illustrates the results in the box, and 
one similar exercise for you to try. 


Worked Exercise A44 


Solve the following equations. 
(a) 4X2 £ =6 (b) 60 xir =9 


Solution 

(a) The HCF of 4 and 12 is 4, but this is not a factor of 6, so the
equation $4 \times_{12} x = 6$ has no solutions.

(b) The HCF of 6 and 15 is d = 3, and this is also a factor of 9, so
the equation $6 \times_{15} x = 9$ has d = 3 solutions.

To find these solutions, we start by finding the solution of the
simpler equation
\[ \frac{6}{3} \times_{15/3} x = \frac{9}{3}, \]
that is,
\[ 2 \times_5 x = 3. \]
By trying possibilities, we find that the solution of this equation
is
\[ x = 4. \]
Also, 15/3 = 5, so the solutions of the original equation are
\[ x = 4, \quad x = 4 + 5 = 9, \quad x = 4 + 2 \times 5 = 14. \]
(A quick check:
\[ 6 \times 4 = 24 \equiv 9 \pmod{15}, \]
\[ 6 \times 9 = 54 \equiv 54 - 45 = 9 \pmod{15}, \]
\[ 6 \times 14 \equiv 6 \times (-1) = -6 \equiv 9 \pmod{15}, \]
as expected.)
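The procedure in the box can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `solve_linear` is hypothetical, and the simpler equation is solved "by trying possibilities", as in the worked exercise, rather than by Euclid's Algorithm.

```python
from math import gcd


def solve_linear(a: int, b: int, n: int) -> list[int]:
    """Return every x in Z_n with a x_n x = b, in increasing order.

    Hypothetical helper following the box 'Linear equations in Z_n'.
    """
    d = gcd(a, n)
    if b % d != 0:
        # d is not a factor of b, so there are no solutions.
        return []
    a1, b1, n1 = a // d, b // d, n // d
    # Solve the simpler equation (a/d) x_{n/d} x = b/d by trying possibilities.
    c = next(x for x in range(n1) if (a1 * x) % n1 == b1 % n1)
    # The d solutions are c, c + n/d, ..., c + (d - 1)n/d.
    return [c + k * n1 for k in range(d)]


print(solve_linear(4, 6, 12))   # part (a): no solutions
print(solve_linear(6, 9, 15))   # part (b): three solutions
```

For large n a search through all of $\mathbb{Z}_{n/d}$ becomes slow; the methods given earlier for finding multiplicative inverses would then be the natural replacement for the search step.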




Unit A2 Number systems 




Exercise A100 


Find all the solutions of the following equations.
(a) $9 \times_{12} x = 6$   (b) $8 \times_{12} x = 7$   (c) $5 \times_{12} x = 2$
(d) $4 \times_{16} x = 12$   (e) $3 \times_{16} x = 13$   (f) $8 \times_{16} x = 2$


As promised, here is a proof of the second bulleted statement in the box
'Linear equations in $\mathbb{Z}_n$'. Before reading it, look back to remind yourself
what this statement says. To see why it holds, let c be the solution of the
simpler equation, as stated. Then
\[ \frac{a}{d} \times_{n/d} c = \frac{b}{d}, \]
so
\[ \frac{a}{d}\,c = \frac{b}{d} + k\,\frac{n}{d} \]
for some integer k, and hence (by multiplying throughout by d),
\[ ac = b + kn, \]
so
\[ a \times_n c = b, \]
that is, c is also a solution of the original equation.


Now consider all the integers r such that c + r is in $\mathbb{Z}_n$ (where c is the
solution discussed above). Let us consider the question: for which of these
values of r is c + r a solution of the original equation?


Well, saying that c + r is a solution of the original equation is equivalent to
saying that
\[ a \times_n (c + r) = b, \]
which, by the multiplication property of congruences, is equivalent to
saying that
\[ a \times_n (c +_n r) = b. \]
By the distributive property for $+_n$ and $\times_n$, the equation above is
equivalent to
\[ (a \times_n c) +_n (a \times_n r) = b. \]
Now $a \times_n c = b$ (since c is a solution of the original equation), so the
equation above is equivalent to
\[ b +_n (a \times_n r) = b, \]
that is,
\[ a \times_n r = 0. \]


So the values of r such that c + r is a solution of the original equation are
the values of r such that
\[ ar \text{ is a multiple of } n, \]
or, equivalently, since both a/d and n/d are integers,
\[ \frac{a}{d}\,r \text{ is a multiple of } \frac{n}{d}. \]
Now a/d and n/d are coprime, so n/d has no factor greater than 1 in common
with a/d. Hence the statement above holds precisely when r is a multiple of
n/d. So our conclusion is that the element c + r of $\mathbb{Z}_n$ is a solution of the
original equation precisely when r is a multiple of n/d. This is what the
second bulleted point in the box claims. (Note that the next multiple of
n/d after (d − 1)n/d is dn/d = n, and adding n to c gives a number that is
too large to be in $\mathbb{Z}_n$.)
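The conclusion of the proof can be checked by brute force for small moduli. The sketch below is a finite check, not a substitute for the proof; the function names are hypothetical, and the case a = 0 is included as an assumption (there d = n, and the count still agrees).

```python
from math import gcd


def brute_force_solutions(a: int, b: int, n: int) -> list[int]:
    """All x in Z_n with a x_n x = b, found by testing every element."""
    return [x for x in range(n) if (a * x) % n == b]


def box_holds_up_to(max_n: int) -> bool:
    """Check both bullet points of the box for every a, b in Z_n, 2 <= n <= max_n."""
    for n in range(2, max_n + 1):
        for a in range(n):
            d = gcd(a, n)
            for b in range(n):
                sols = brute_force_solutions(a, b, n)
                if b % d != 0:
                    if sols:                      # first bullet: no solutions
                        return False
                else:
                    if len(sols) != d:            # second bullet: exactly d solutions
                        return False
                    step = n // d                 # consecutive solutions differ by n/d
                    if any(y - x != step for x, y in zip(sols, sols[1:])):
                        return False
    return True


print(box_holds_up_to(30))
```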


Summary 


In this unit you have studied the properties of various different number
systems. You have seen that $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ and $\mathbb{Z}_p$ (for p prime) all satisfy the
eleven standard arithmetical properties you met in Subsection 1.2
(together with the trivial twelfth property) and so are all fields, and that
the existence of a multiplicative inverse for every non-zero element means
that every linear equation in these number systems has a solution. You
also saw that in number systems that are not fields, for example, in $\mathbb{Z}$ and
in $\mathbb{Z}_n$, where n is not prime, some, but certainly not all, linear equations
have solutions. In the field $\mathbb{C}$, you saw that every polynomial equation
with complex coefficients has a solution, and explored ways of finding such
a solution in certain special cases.


These number systems and their properties are used throughout the rest of 
the module. 


Learning outcomes 


After working through this unit, you should be able to:

• understand the arithmetic properties of the rational and real numbers

• understand the properties a number system satisfies if it is a field

• understand and use the Factor Theorem

• understand the definition of a complex number and represent complex
numbers as points in the complex plane

• perform arithmetic operations with complex numbers in Cartesian, polar
and exponential form, and convert between these forms as appropriate

• use de Moivre's Theorem to find the nth roots of a complex number

• understand the Division Theorem and the properties of congruences, and
perform modular arithmetic

• use Euclid's Algorithm and backwards substitution to find multiplicative
inverses in modular arithmetic, where these exist

• solve linear equations in $\mathbb{Z}_n$.




Solutions to exercises 


Solution to Exercise A57 


(a) There is no integer $2^{-1}$ such that
$2 \times 2^{-1} = 2^{-1} \times 2 = 1$, since $\tfrac{1}{2} \notin \mathbb{Z}$, for example.


(b) Only the numbers 1 and —1 have a 
multiplicative inverse in Z. (The multiplicative 
inverse of 1 is 1, and of —1 is —1.) 


Solution to Exercise A58 


(a) (i) The equation has solution x = —2, which 
belongs to Q. 

(ii) The equation has solution x = —4, which 
belongs to Q. 


(b) (i) The equation has solution x = 3, which 
belongs to R. 


7 
ii) The equation has solution x = ——~, which 
(ii) q Fi 


belongs to R. 


Solution to Exercise A59 
(a) Factorising the equation
\[ x^2 - 7x + 12 = 0 \]
gives
\[ (x - 3)(x - 4) = 0. \]
So this equation has two solutions in $\mathbb{R}$, namely
x = 3 and x = 4.


(b) Factorising the equation
\[ x^2 + 6x + 9 = 0 \]
gives
\[ (x + 3)^2 = 0. \]
So this equation has one solution in $\mathbb{R}$, namely
x = −3.


(c) Factorising the equation
\[ 2x^2 + 5x - 3 = 0 \]
gives
\[ (2x - 1)(x + 3) = 0. \]
So this equation has two solutions in $\mathbb{R}$, namely
$x = \tfrac{1}{2}$ and x = −3.


(d) Applying the quadratic formula to the
equation
\[ 2x^2 - 2x - 1 = 0 \]
gives
\[ x = \frac{2 \pm \sqrt{4 + 8}}{4} = \tfrac{1}{2} \pm \tfrac{1}{2}\sqrt{3}. \]
This equation has two solutions in $\mathbb{R}$.


(e) Applying the quadratic formula to the
equation
\[ x^2 - 2x + 5 = 0 \]
gives
\[ x = \frac{2 \pm \sqrt{4 - 20}}{2}. \]
Since 4 − 20 = −16, which is negative, this
equation has no solutions in $\mathbb{R}$.


(f) Factorising the equation
\[ x^2 - 2\sqrt{3}\,x + 3 = 0 \]
gives
\[ (x - \sqrt{3})^2 = 0. \]
So this equation has one solution in $\mathbb{R}$, namely
$x = \sqrt{3}$.


Solution to Exercise A60 


(a) By the Factor Theorem (Theorem A2), x + 3
is a factor of p(x) if and only if p(−3) = 0, that is,
\[ 0 = (-3)^3 + k(-3)^2 + 6(-3) + 36 = -27 + 9k - 18 + 36 = 9k - 9. \]
This equation has just one solution, k = 1, so the
only value of k for which x + 3 is a factor of p(x) is
k = 1.


(b) We have
\[ x^3 + x^2 + 6x + 36 = (x + 3)(ax^2 + bx + c), \]
for some real numbers a, b and c.


Equating the coefficients of $x^3$ gives 1 = a.
Comparing the constant terms gives 36 = 3c, so
c = 12. Thus we have
\[ x^3 + x^2 + 6x + 36 = (x + 3)(x^2 + bx + 12). \]
Equating the coefficients of $x^2$ gives 1 = 3 + b, so
b = −2. Hence
\[ x^3 + x^2 + 6x + 36 = (x + 3)(x^2 - 2x + 12). \]


Solution to Exercise A61 


(a) Since all the roots are integers, the only
possible roots are the factors of −15, that is,
±1, ±3, ±5, ±15. Considering these in turn, we
obtain the following table.

x      1    −1    3    −3     5
p(x)   0   −48    0   −192    0

We do not need to work out any more values, as
we already have three roots: x = 1, x = 3 and
x = 5. So, since the coefficient of $x^3$ is 1,
\[ p(x) = (x - 1)(x - 3)(x - 5). \]
As a check, we note that the coefficient of $x^2$ is
equal to minus the sum of the roots,
−9 = −(1 + 3 + 5).


(b) Let
\[ p(x) = x^3 - 3x^2 + 4. \]
Since all the roots of p(x) are integers, the only
possible roots are the factors of 4, that is,
±1, ±2, ±4. Considering these in turn, we obtain
the following table.

x      1   −1    2    −2     4     −4
p(x)   2    0    0   −16    20   −108

Thus the only solutions are x = −1 and x = 2.


Since p(x) is a cubic polynomial, it must have
three linear factors, so one of these solutions must
give rise to a repeated factor. The coefficient of $x^2$
is −3, and this must equal minus the sum of the
roots. This is only possible if (x − 2) is the
repeated factor; we then have −3 = −(2 + 2 − 1).


The coefficient of $x^3$ is 1, so
\[ p(x) = (x - 2)(x - 2)(x + 1). \]
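The search for integer roots among the factors of the constant term can be sketched in Python. This is a minimal illustration: `integer_roots` is a hypothetical helper, coefficients are listed from the highest power down, and the cubic for part (a) is taken to be $x^3 - 9x^2 + 23x - 15$, an assumption obtained by expanding the factorisation (x − 1)(x − 3)(x − 5).

```python
def integer_roots(coeffs: list[int]) -> list[int]:
    """Integer roots of a polynomial with integer coefficients.

    coeffs lists the coefficients from the highest power down to the
    constant term, which is assumed to be non-zero.
    """
    constant = abs(coeffs[-1])
    # Any integer root must be a factor (positive or negative) of the constant term.
    factors = [d for d in range(1, constant + 1) if constant % d == 0]

    def p(x: int) -> int:
        value = 0
        for c in coeffs:          # Horner's rule
            value = value * x + c
        return value

    return sorted(x for d in factors for x in (d, -d) if p(x) == 0)


print(integer_roots([1, -9, 23, -15]))   # part (a), assumed polynomial
print(integer_roots([1, -3, 0, 4]))      # part (b): x^3 - 3x^2 + 4
```

The function reports each root once, so a repeated factor such as (x − 2) in part (b) still has to be identified by the sum-of-roots argument used above.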


Solution to Exercise A62 


(a) A suitable equation is
\[ (x - 1)(x - 2)(x - 3)(x + 3) = 0, \]
that is,
\[ x^4 - 3x^3 - 7x^2 + 27x - 18 = 0. \]
There are many other possibilities; for example,
any of the factors could be repeated.


(b) A suitable equation is
\[ (x - 2)(x - 2)(x - 3) = 0, \]
that is,
\[ x^3 - 7x^2 + 16x - 12 = 0. \]
Another possibility is
\[ (x - 2)(x - 3)(x - 3) = 0, \]
that is,
\[ x^3 - 8x^2 + 21x - 18 = 0. \]


Solution to Exercise A63 


(a) The equation $z^2 - 4z + 7 = 0$ has solutions
\[ z = \frac{4 \pm \sqrt{16 - 28}}{2} = \frac{4 \pm \sqrt{-12}}{2} = 2 \pm \frac{i\sqrt{12}}{2} = 2 \pm i\sqrt{3}; \]
that is, the solutions are $z = 2 + i\sqrt{3}$ and
$z = 2 - i\sqrt{3}$.


(b) The equation $z^2 - iz + 2 = 0$ has solutions
\[ z = \frac{i \pm \sqrt{i^2 - 8}}{2} = \frac{i \pm \sqrt{-9}}{2} = \frac{i}{2} \pm \frac{3i}{2}; \]
that is, the solutions are z = 2i and z = −i.


(c) We can factorise the equation
\[ z^3 - 3z^2 + 4z - 2 = 0 \]
as
\[ (z - 1)(az^2 + bz + c) = 0, \]
and by equating coefficients we have a = 1, c = 2
and b = −2, giving
\[ (z - 1)(z^2 - 2z + 2) = 0. \]
Hence z = 1 or
\[ z = \frac{2 \pm \sqrt{4 - 8}}{2} = \frac{2 \pm 2\sqrt{-1}}{2} = 1 \pm i, \]
so the solutions are z = 1, z = 1 + i and z = 1 − i.

(d) The equation $z^4 - 16 = 0$ can be factorised as
\[ (z^2 - 4)(z^2 + 4) = 0, \]
giving $z^2 = 4$ or $z^2 = -4$, so z = ±2 or z = ±2i.


Hence the solutions are z = 2, z = −2, z = 2i and
z = −2i.


Solution to Exercise A64 


Solution to Exercise A65 
(a) $(3 - 5i) + (2 + 4i) = 5 - i$


(b) $(2 - 3i)(-3 + 2i) = -6 + 9i + 4i - 6i^2 = 13i$


(c) $(5 + 3i)^2 = (5 + 3i)(5 + 3i) = 25 + 15i + 15i + 9i^2 = 16 + 30i$

(d) $(1 + i)(7 + 2i) = 7 + 7i + 2i + 2i^2 = 5 + 9i$,
so
\[ (1 + i)(7 + 2i)(4 - i) = (5 + 9i)(4 - i) = 20 + 36i - 5i - 9i^2 = 29 + 31i. \]


Solution to Exercise A66 
$\bar{z}_1 = -2 - 3i$ and $\bar{z}_2 = 3 + i$.

[Figure: the points $z_1 = -2 + 3i$ and $\bar{z}_1 = -2 - 3i$ plotted in the complex plane.]


Solution to Exercise A67
Property 2
Let $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$. Then
\[ z_1 z_2 = (x_1 + iy_1)(x_2 + iy_2) = x_1x_2 + ix_2y_1 + ix_1y_2 + i^2y_1y_2 = (x_1x_2 - y_1y_2) + i(x_2y_1 + x_1y_2), \]
so
\[ \overline{z_1 z_2} = (x_1x_2 - y_1y_2) - i(x_2y_1 + x_1y_2). \]
Also,
\[ \bar{z}_1 \times \bar{z}_2 = (x_1 - iy_1)(x_2 - iy_2) = x_1x_2 - ix_2y_1 - ix_1y_2 + i^2y_1y_2 = (x_1x_2 - y_1y_2) - i(x_2y_1 + x_1y_2). \]
Therefore
\[ \overline{z_1 z_2} = \bar{z}_1 \times \bar{z}_2. \]
Property 3
Let z = x + iy. Then
\[ z + \bar{z} = x + iy + x - iy = 2x = 2\,\mathrm{Re}\,z. \]
Property 4
Let z = x + iy. Then
\[ z - \bar{z} = x + iy - (x - iy) = 2iy = 2i\,\mathrm{Im}\,z. \]


Solution to Exercise A68 


(a) $|5 + 12i| = \sqrt{5^2 + 12^2} = \sqrt{169} = 13$
(b) $|1 + i| = \sqrt{1^2 + 1^2} = \sqrt{2}$
(c) $|-5| = \sqrt{(-5)^2 + 0^2} = 5$
Solution to Exercise A69 


(a) Here
\[ z_1 - z_2 = (3 + i) - (1 + 2i) = 2 - i, \]
so
\[ |z_1 - z_2| = \sqrt{2^2 + (-1)^2} = \sqrt{5}. \]

(b) Here
\[ z_1 - z_2 = (-5 - 3i) - (2 - 7i) = -7 + 4i, \]
so
\[ |z_1 - z_2| = \sqrt{(-7)^2 + 4^2} = \sqrt{65}. \]


Solution to Exercise A70 


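In each case we multiply both the numerator and
the denominator by the complex conjugate of the
denominator, and use $z\bar{z} = |z|^2$.

(a) \[ \frac{1}{3 - i} = \frac{3 + i}{(3 - i)(3 + i)} = \frac{3 + i}{3^2 + (-1)^2} = \tfrac{3}{10} + \tfrac{1}{10}i = \tfrac{1}{10}(3 + i) \]

(b) \[ \frac{1}{-1 + 2i} = \frac{-1 - 2i}{(-1 + 2i)(-1 - 2i)} = \frac{-1 - 2i}{(-1)^2 + 2^2} = -\tfrac{1}{5}(1 + 2i) \]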


Solution to Exercise A71 


In each case we multiply the numerator and
denominator by the complex conjugate of the
denominator, and use $z\bar{z} = |z|^2$.

(a) \[ \frac{5}{2 - i} = \frac{5(2 + i)}{(2 - i)(2 + i)} = \frac{10 + 5i}{2^2 + (-1)^2} = \tfrac{1}{5}(10 + 5i) = 2 + i \]




(b) \[ \frac{2 + 3i}{-3 + 4i} = \frac{(2 + 3i)(-3 - 4i)}{(-3 + 4i)(-3 - 4i)} = \frac{-6 - 9i - 8i - 12i^2}{(-3)^2 + 4^2} = \frac{6 - 17i}{25} = \tfrac{1}{25}(6 - 17i) \]


Solution to Exercise A72
(a) The required form is x + iy, where
\[ x = 2\cos\frac{\pi}{2} = 0 \]
and
\[ y = 2\sin\frac{\pi}{2} = 2. \]
The Cartesian form is therefore 2i.


(b) The required form is x + iy, where
\[ x = 4\cos\left(-\frac{2\pi}{3}\right) = 4\cos\frac{2\pi}{3} = -4\cos\frac{\pi}{3} = -2 \]
and
\[ y = 4\sin\left(-\frac{2\pi}{3}\right) = -4\sin\frac{2\pi}{3} = -4\sin\frac{\pi}{3} = -2\sqrt{3}. \]
The Cartesian form is therefore $-2(1 + i\sqrt{3})$.


Solution to Exercise A73
(a) Let z = x + iy = −1 + i, so x = −1 and y = 1.

[Figure: the point −1 + i in the complex plane, with θ = 3π/4.]

Then z = r(cos θ + i sin θ), where
\[ r = \sqrt{(-1)^2 + 1^2} = \sqrt{2}. \]
Also
\[ \cos\phi = \frac{|x|}{r} = \frac{1}{\sqrt{2}}. \]
So φ = π/4, and from the diagram (or because z
lies in the second quadrant) we have
θ = π − φ = 3π/4.
Thus the polar form of −1 + i in terms of the
principal argument is
\[ \sqrt{2}\left(\cos\frac{3\pi}{4} + i\sin\frac{3\pi}{4}\right). \]

(b) Let $z = x + iy = 1 - i\sqrt{3}$, so x = 1 and
$y = -\sqrt{3}$.

[Figure: the point $1 - i\sqrt{3}$ in the complex plane.]

Then z = r(cos θ + i sin θ), where
\[ r = \sqrt{1^2 + (-\sqrt{3})^2} = 2. \]
Also
\[ \cos\phi = \frac{|x|}{r} = \frac{1}{2}. \]
So φ = π/3, and z lies in the fourth quadrant, so
θ = −φ = −π/3.
Thus the polar form of $1 - i\sqrt{3}$ in terms of the
principal argument is
\[ 2\left(\cos\left(-\frac{\pi}{3}\right) + i\sin\left(-\frac{\pi}{3}\right)\right). \]

(c) Let z = x + iy = −5, so x = −5 and y = 0.

[Figure: the point −5 in the complex plane, with θ = π.]

Then z = r(cos θ + i sin θ), where
\[ r = \sqrt{(-5)^2 + 0^2} = 5. \]
Also z lies on the negative half of the real axis, so
θ = π.


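Thus the polar form of −5 in terms of the principal
argument is
\[ 5(\cos\pi + i\sin\pi). \]

Solution to Exercise A74
(a) The modulus of the product is
\[ 4 \times \tfrac{1}{2} = 2. \]
An argument is
\[ \frac{\pi}{6} + \left(-\frac{7\pi}{8}\right) = -\frac{17\pi}{24}. \]
Since this argument lies in (−π, π], it is the
principal argument. The required product is
therefore
\[ 2\left(\cos\left(-\frac{17\pi}{24}\right) + i\sin\left(-\frac{17\pi}{24}\right)\right). \]
The modulus of the quotient is
\[ 4 \div \tfrac{1}{2} = 8. \]
An argument is
\[ \frac{\pi}{6} - \left(-\frac{7\pi}{8}\right) = \frac{25\pi}{24}. \]
The principal argument is therefore
\[ \frac{25\pi}{24} - 2\pi = -\frac{23\pi}{24}. \]
The required quotient is therefore
\[ 8\left(\cos\left(-\frac{23\pi}{24}\right) + i\sin\left(-\frac{23\pi}{24}\right)\right). \]

(b) The modulus of the product is
\[ 3 \times \tfrac{1}{2} = \tfrac{3}{2}. \]
An argument is
\[ \frac{2\pi}{3} + \frac{\pi}{2} = \frac{7\pi}{6}. \]
The principal argument is therefore
\[ \frac{7\pi}{6} - 2\pi = -\frac{5\pi}{6}. \]
The required product is therefore
\[ \tfrac{3}{2}\left(\cos\left(-\frac{5\pi}{6}\right) + i\sin\left(-\frac{5\pi}{6}\right)\right). \]
The modulus of the quotient is
\[ 3 \div \tfrac{1}{2} = 6. \]
An argument is
\[ \frac{2\pi}{3} - \frac{\pi}{2} = \frac{\pi}{6}. \]
Since this argument lies in (−π, π], it is the
principal argument. The required quotient is
therefore
\[ 6\left(\cos\frac{\pi}{6} + i\sin\frac{\pi}{6}\right). \]

Solution to Exercise A75
From the solution to Exercise A73,
\[ z_1 = \sqrt{2}\left(\cos\frac{3\pi}{4} + i\sin\frac{3\pi}{4}\right), \quad z_2 = 2\left(\cos\left(-\frac{\pi}{3}\right) + i\sin\left(-\frac{\pi}{3}\right)\right), \quad z_3 = 5(\cos\pi + i\sin\pi). \]
Hence
\[ z_1 z_2 z_3 = 10\sqrt{2}\left(\cos\left(\frac{3\pi}{4} - \frac{\pi}{3} + \pi\right) + i\sin\left(\frac{3\pi}{4} - \frac{\pi}{3} + \pi\right)\right) = 10\sqrt{2}\left(\cos\frac{17\pi}{12} + i\sin\frac{17\pi}{12}\right) = 10\sqrt{2}\left(\cos\left(-\frac{7\pi}{12}\right) + i\sin\left(-\frac{7\pi}{12}\right)\right), \]
using the principal argument.
Also
\[ \frac{z_2 z_3}{z_1} = \frac{10}{\sqrt{2}}\left(\cos\left(-\frac{\pi}{3} + \pi - \frac{3\pi}{4}\right) + i\sin\left(-\frac{\pi}{3} + \pi - \frac{3\pi}{4}\right)\right) = 5\sqrt{2}\left(\cos\left(-\frac{\pi}{12}\right) + i\sin\left(-\frac{\pi}{12}\right)\right). \]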




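Solution to Exercise A76

(a) $1 = \cos 0 + i\sin 0$

(b) If $z_0 = \cos 0 + i\sin 0$, then, by de Moivre's
Theorem,
\[ z_0^3 = \cos 0 + i\sin 0 = 1. \]
If $z_1 = \cos(2\pi/3) + i\sin(2\pi/3)$, then, by
de Moivre's Theorem,
\[ z_1^3 = \cos 2\pi + i\sin 2\pi = \cos 0 + i\sin 0 = 1. \]
If $z_2 = \cos(4\pi/3) + i\sin(4\pi/3)$, then, by
de Moivre's Theorem,
\[ z_2^3 = \cos 4\pi + i\sin 4\pi = \cos 0 + i\sin 0 = 1. \]
(c) In Cartesian form,
\[ z_0 = 1, \quad z_1 = -\tfrac{1}{2}(1 - i\sqrt{3}), \quad z_2 = -\tfrac{1}{2}(1 + i\sqrt{3}). \]

Solution to Exercise A77
(a) Let z = r(cos θ + i sin θ). Then, since
\[ 1 = 1(\cos 0 + i\sin 0), \]
we have
\[ z^6 = r^6(\cos 6\theta + i\sin 6\theta) = 1(\cos 0 + i\sin 0). \]
Hence $r = 1^{1/6} = 1$ and $\theta = 0 + \frac{2k\pi}{6}$ for
k = 0, 1, ..., 5, and the six solutions of $z^6 = 1$ are
given by
\[ z_k = \cos\frac{2k\pi}{6} + i\sin\frac{2k\pi}{6} \]
for k = 0, 1, ..., 5.


Hence the solutions using the principal arguments
are
\[ z_0 = \cos 0 + i\sin 0, \]
\[ z_1 = \cos\frac{\pi}{3} + i\sin\frac{\pi}{3}, \]
\[ z_2 = \cos\frac{2\pi}{3} + i\sin\frac{2\pi}{3}, \]
\[ z_3 = \cos\pi + i\sin\pi, \]
\[ z_4 = \cos\frac{4\pi}{3} + i\sin\frac{4\pi}{3} = \cos\left(-\frac{2\pi}{3}\right) + i\sin\left(-\frac{2\pi}{3}\right), \]
\[ z_5 = \cos\frac{5\pi}{3} + i\sin\frac{5\pi}{3} = \cos\left(-\frac{\pi}{3}\right) + i\sin\left(-\frac{\pi}{3}\right). \]

(b) [Figure: the solutions $z_0, \ldots, z_5$ plotted in the complex plane.]

In Cartesian form,
\[ z_0 = 1, \quad z_1 = \tfrac{1}{2}(1 + i\sqrt{3}), \quad z_2 = -\tfrac{1}{2}(1 - i\sqrt{3}), \quad z_3 = -1, \quad z_4 = -\tfrac{1}{2}(1 + i\sqrt{3}), \quad z_5 = \tfrac{1}{2}(1 - i\sqrt{3}). \]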


Solution to Exercise A78 
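Let z = r(cos θ + i sin θ). Then, since
\[ -4 = 4(\cos\pi + i\sin\pi), \]
we have
\[ z^4 = r^4(\cos 4\theta + i\sin 4\theta) = 4(\cos\pi + i\sin\pi). \]
Hence $r = 4^{1/4} = \sqrt{2}$ and $\theta = \frac{\pi}{4} + \frac{2k\pi}{4}$ for
k = 0, 1, 2, 3.
So the solutions are
\[ z_0 = \sqrt{2}\left(\cos\frac{\pi}{4} + i\sin\frac{\pi}{4}\right) = 1 + i, \]
\[ z_1 = \sqrt{2}\left(\cos\frac{3\pi}{4} + i\sin\frac{3\pi}{4}\right) = -1 + i, \]
\[ z_2 = \sqrt{2}\left(\cos\frac{5\pi}{4} + i\sin\frac{5\pi}{4}\right) = -1 - i, \]
\[ z_3 = \sqrt{2}\left(\cos\frac{7\pi}{4} + i\sin\frac{7\pi}{4}\right) = 1 - i. \]

[Figure: the solutions $z_0$, $z_1$, $z_2$, $z_3$ plotted in the complex plane.]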
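The polar-form recipe used in this solution (take the nth root of the modulus, then divide the arguments θ + 2kπ by n) can be sketched with Python's standard `cmath` module. The helper name `nth_roots` is hypothetical, and the results are floating-point approximations, so they are compared with a tolerance rather than exactly.

```python
import cmath
import math


def nth_roots(w: complex, n: int) -> list[complex]:
    """The n solutions of z**n = w for non-zero w, via polar form."""
    r, theta = cmath.polar(w)              # modulus and principal argument of w
    root_modulus = r ** (1.0 / n)
    return [
        cmath.rect(root_modulus, (theta + 2 * math.pi * k) / n)
        for k in range(n)
    ]


for z in nth_roots(complex(-4, 0), 4):
    # Each z should be close to one of 1 + i, -1 + i, -1 - i, 1 - i.
    print(round(z.real, 6), round(z.imag, 6))
```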


Solution to Exercise A79 
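Let z = r(cos θ + i sin θ).
Since $8i = 8\left(\cos\frac{\pi}{2} + i\sin\frac{\pi}{2}\right)$, we have
\[ z^3 = r^3(\cos 3\theta + i\sin 3\theta) = 8\left(\cos\frac{\pi}{2} + i\sin\frac{\pi}{2}\right). \]
Hence $r = 8^{1/3} = 2$ and $\theta = \frac{\pi}{6} + \frac{2k\pi}{3}$ for k = 0, 1, 2.


So the solutions are
\[ z_0 = 2\left(\cos\frac{\pi}{6} + i\sin\frac{\pi}{6}\right), \]
\[ z_1 = 2\left(\cos\frac{5\pi}{6} + i\sin\frac{5\pi}{6}\right), \]
\[ z_2 = 2\left(\cos\frac{3\pi}{2} + i\sin\frac{3\pi}{2}\right). \]

[Figure: the solutions $z_0$, $z_1$, $z_2$ plotted in the complex plane.]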


Solution to Exercise A80 


(a) We have
\[ p(2i) = (2i)^4 - 2(2i)^3 + 7(2i)^2 - 8(2i) + 12 = 16i^4 - 16i^3 + 28i^2 - 16i + 12 = 16 + 16i - 28 - 16i + 12 = 0, \]
so 2i is a root of p(z).
(b) Since p has real coefficients, z = −2i is also a
root of p(z), so $(z - 2i)(z + 2i) = z^2 + 4$ is a factor
of p(z).
By equating coefficients, we obtain
\[ z^4 - 2z^3 + 7z^2 - 8z + 12 = (z^2 + 4)(z^2 - 2z + 3). \]
So the remaining two roots of p(z) are the
solutions of the equation $z^2 - 2z + 3 = 0$.


Using the quadratic formula, we have
\[ z = \frac{2 \pm \sqrt{4 - 12}}{2} = 1 \pm i\sqrt{2}. \]
Hence the four roots of p(z) are 2i, −2i, $1 + i\sqrt{2}$
and $1 - i\sqrt{2}$.


Solution to Exercise A81 


A suitable polynomial is
\[ (z - 1)(z + 2)(z - 3i)(z + 3i), \]
that is,
\[ (z^2 + z - 2)(z^2 + 9) \]
or
\[ z^4 + z^3 + 7z^2 + 9z - 18. \]


Solution to Exercise A82 
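(a) Let z = x + iy; then
\[ \frac{1}{e^z} = \frac{1}{e^{x + iy}} \]
\[ = \frac{1}{e^x(\cos y + i\sin y)} \quad \text{(by definition)} \]
\[ = e^{-x}(\cos y + i\sin y)^{-1} \]
\[ = e^{-x}(\cos(-y) + i\sin(-y)) \quad \text{(by de Moivre's Theorem with } n = -1\text{)} \]
\[ = e^{-x + i(-y)} \quad \text{(by definition)} \]
\[ = e^{-z}. \]
(b) \[ \frac{e^{z_1}}{e^{z_2}} = e^{z_1} e^{-z_2} \quad \text{(by part (a))} \]
\[ = e^{z_1 + (-z_2)} \quad \text{(by Worked Exercise A36)} \]
\[ = e^{z_1 - z_2}. \]


Solution to Exercise A83
Euler's Identity is $e^{i\pi} + 1 = 0$; that is, $-1 = e^{i\pi}$.


We have
\[ -z = -1 \times re^{i\theta} = e^{i\pi} \times re^{i\theta} \quad \text{(by Euler's Identity)} \]
\[ = re^{i(\theta + \pi)}. \]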


Solution to Exercise A84 


(a) 65 = 9 x 7 + 2, so the quotient is 9 and the 
remainder is 2. 




(b) —256 = —20 x 13 + 4, so the quotient is —20 
and the remainder is 4. 
Solution to Exercise A85 


(a) The possible remainders are 0, 1, 2, 3, 4, 5 
and 6. 


(b) There are many possible answers here; for 
example, 3, 10, —4 and —11. 


Solution to Exercise A86 


We have
\[ 25 \equiv 8 \pmod{17}, \]
\[ 53 \equiv 2 \pmod{17}, \]
\[ -15 \equiv 2 \pmod{17}, \]
\[ 3 \equiv 3 \pmod{17}, \]
\[ 127 \equiv 8 \pmod{17}, \]
so the remainders are 8, 2, 2, 3 and 8, respectively.
So $25 \equiv 127 \pmod{17}$ and $53 \equiv -15 \pmod{17}$.


Solution to Exercise A87 


Suppose that $a \equiv b \pmod{n}$ and $c \equiv d \pmod{n}$.
Then a − b = kn and c − d = ln for some integers k
and l. Hence a = b + kn and c = d + ln for some
integers k and l, so
\[ ac = (b + kn)(d + ln) = bd + bln + knd + kln^2 = bd + n(bl + kd + kln). \]
Therefore ac − bd = (bl + kd + kln)n. Since
bl + kd + kln is an integer, it follows that $ac \equiv bd \pmod{n}$.
Thus the multiplication property holds.


Solution to Exercise A88 


(a) Using the transitivity property of congruences
we obtain
\[ 3869 \equiv 669 \equiv 29 \equiv 13 \pmod{16} \]
and
\[ 1685 \equiv 85 \equiv 5 \pmod{16}, \]
so 3869 has remainder 13 on division by 16, and
1685 has remainder 5 on division by 16.


(b) Using the addition property of congruences
and the answer to part (a), we obtain
\[ (3869 + 1685) \equiv (13 + 5) = 18 \equiv 2 \pmod{16}, \]
so 3869 + 1685 has remainder 2 on division by 16.


(c) Using the powers property of congruences and
the answer to part (b), we obtain
\[ (3869 + 1685)^4 \equiv 2^4 = 16 \equiv 0 \pmod{16}, \]
so $(3869 + 1685)^4$ has remainder 0 on division
by 16; that is, $(3869 + 1685)^4$ is divisible by 16.
Since
\[ (3869 + 1685)^{111} = (3869 + 1685)^4 \times (3869 + 1685)^{107}, \]
the multiplication property of congruences gives
\[ (3869 + 1685)^{111} \equiv 0 \times (3869 + 1685)^{107} = 0 \pmod{16}. \]
Hence $(3869 + 1685)^{111}$ has remainder 0 on division
by 16; that is, it is divisible by 16.


Alternatively, it is possible to conclude directly
that $(3869 + 1685)^{111}$ is divisible by 16 (and hence
has remainder 0 on division by 16) since it is
divisible by $(3869 + 1685)^4$.
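These remainders can be confirmed with Python's `%` operator and the three-argument form of the built-in `pow`, which reduces modulo its third argument at each step. This is only a numerical check, taking the large exponent to be 111 (an assumption; any exponent of at least 4 gives the same remainder).

```python
total = 3869 + 1685

remainder_a = (3869 % 16, 1685 % 16)   # part (a): remainders of the two numbers
remainder_b = total % 16               # part (b): remainder of the sum
remainder_c4 = pow(total, 4, 16)       # part (c): fourth power modulo 16
remainder_c = pow(total, 111, 16)      # part (c): assumed exponent 111

print(remainder_a, remainder_b, remainder_c4, remainder_c)
```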


Solution to Exercise A89 
(a) $3 +_5 2 = 0$

(b) $4 +_{17} 5 = 9$

(c) $8 +_{16} 12 = 4$

(d) $3 \times_5 2 = 1$

(e) $4 \times_{17} 5 = 3$

(f) $8 \times_{16} 12 = 0$


Solution to Exercise A90 


There are many ways to calculate these products 
in modular arithmetic; your method may differ 
from those below. 


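(a) We have
\[ 7 \times 26 \equiv 7 \times (-1) = -7 \equiv 20 \pmod{27}. \]
Thus $7 \times_{27} 26 = 20$.


(b) We have
\[ 16 \times 14 = 8 \times 2 \times 14 = 8 \times 28 \equiv 8 \times (-1) = -8 \equiv 21 \pmod{29}. \]
Thus $16 \times_{29} 14 = 21$.

(c) We have
\[ 9 \times 15 = 3 \times 3 \times 15 = 3 \times 45 \equiv 3 \times 12 = 36 \equiv 3 \pmod{33}. \]
Thus $9 \times_{33} 15 = 3$.

(d) We have
\[ 37 \times 23 \equiv -8 \times 23 = -4 \times 2 \times 23 = -4 \times 46 \equiv -4 \times 1 = -4 \equiv 41 \pmod{45}. \]
Thus $37 \times_{45} 23 = 41$.

(e) We have
\[ 15 \times 6 = 15 \times 2 \times 3 = 30 \times 3 \equiv -4 \times 3 = -12 \equiv 22 \pmod{34}. \]
Thus $15 \times_{34} 6 = 22$.

(f) We have
\[ 9 \times 18 = 9 \times 9 \times 2 = 81 \times 2 \equiv 1 \times 2 = 2 \pmod{40}. \]
Thus $9 \times_{40} 18 = 2$.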




Solution to Exercise A91 


(a) From the tables, we have the following.
(i) $3 +_4 3 = 2$, so if $x +_4 3 = 2$ then x = 3.
(ii) $4 +_7 5 = 2$, so if $x +_7 5 = 2$ then x = 4.
(iii) $2 +_4 2 = 0$, so if $x +_4 2 = 0$ then x = 2.
(iv) $2 +_7 5 = 0$, so if $x +_7 5 = 0$ then x = 2.


(b) You may have noticed that: 


e each element appears exactly once in each row 
and exactly once in each column 

e there is a pattern of diagonal stripes of unique 
numbers running down from right to left. 


Solution to Exercise A92 


(a)
+_6 | 0 1 2 3 4 5
----+------------
 0  | 0 1 2 3 4 5
 1  | 1 2 3 4 5 0
 2  | 2 3 4 5 0 1
 3  | 3 4 5 0 1 2
 4  | 4 5 0 1 2 3
 5  | 5 0 1 2 3 4

(b) $x +_6 1 = 5$ has solution x = 4.

$x +_6 5 = 1$ has solution x = 2.


Solution to Exercise A93 


By definition, $a +_n b$ and $b +_n a$ are the remainders
of the integers a + b and b + a, respectively, on
division by n. Since ordinary addition is
commutative, we have a + b = b + a, so
$a +_n b = b +_n a$, and the commutative property
(A5) holds.


Solution to Exercise A94 
(a)
 a  | 0 1 2 3 4 5 6
 −a | 0 6 5 4 3 2 1


The additive inverse of 0 is always 0, since
\[ 0 +_n 0 = 0. \]
For any integer r > 0 in $\mathbb{Z}_n$, $n - r \in \mathbb{Z}_n$ and
r + (n − r) = n, so $r +_n (n - r) = 0$.


Solution to Exercise A95 


(a) (i) The elements 1 and 3 of $\mathbb{Z}_4$ have
multiplicative inverses in $\mathbb{Z}_4$: 1 has multiplicative
inverse 1 since $1 \times_4 1 = 1$, and similarly 3 has
multiplicative inverse 3 since $3 \times_4 3 = 1$. The other
elements of $\mathbb{Z}_4$, namely 0 and 2, do not have
multiplicative inverses.

(ii) The non-zero elements of $\mathbb{Z}_7$ have
multiplicative inverses as given in the following
table, where b is a multiplicative inverse of a.

 a | 1 2 3 4 5 6
 b | 1 4 5 2 3 6

(b)
×_10 | 0 1 2 3 4 5 6 7 8 9
-----+--------------------
  0  | 0 0 0 0 0 0 0 0 0 0
  1  | 0 1 2 3 4 5 6 7 8 9
  2  | 0 2 4 6 8 0 2 4 6 8
  3  | 0 3 6 9 2 5 8 1 4 7
  4  | 0 4 8 2 6 0 4 8 2 6
  5  | 0 5 0 5 0 5 0 5 0 5
  6  | 0 6 2 8 4 0 6 2 8 4
  7  | 0 7 4 1 8 5 2 9 6 3
  8  | 0 8 6 4 2 0 8 6 4 2
  9  | 0 9 8 7 6 5 4 3 2 1


The elements 1, 3, 7 and 9 of $\mathbb{Z}_{10}$ have
multiplicative inverses in $\mathbb{Z}_{10}$, as given in the
following table, where b is a multiplicative inverse
of a.

 a | 1 3 7 9
 b | 1 7 3 9

The other elements of $\mathbb{Z}_{10}$, namely 0, 2, 4, 5, 6
and 8, do not have multiplicative inverses.
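The inverse tables in this solution can be reproduced by searching each multiplication table for products equal to 1. The sketch below uses a hypothetical helper `inverses_in_Zn`; it is a direct search, which is practical only for small n.

```python
def inverses_in_Zn(n: int) -> dict[int, int]:
    """Map each element a of Z_n that has a multiplicative inverse to that inverse."""
    return {
        a: b
        for a in range(n)
        for b in range(n)
        if (a * b) % n == 1      # b is the multiplicative inverse of a in Z_n
    }


print(inverses_in_Zn(4))
print(inverses_in_Zn(7))
print(inverses_in_Zn(10))
```

Elements missing from the returned dictionary, such as 0, 2, 4, 5, 6 and 8 for n = 10, are exactly those with no multiplicative inverse.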


Solution to Exercise A96 
Applying Euclid’s Algorithm gives 
201 = 2 x 81 + 39 
81 = 2 x 39+3 
39=13x3+0. 


The HCF of 201 and 81 is therefore 3. It follows
that 201 and 81 are not coprime, and hence that
81 does not have a multiplicative inverse in $\mathbb{Z}_{201}$.


Solution to Exercise A97 
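(a) Euclid's Algorithm gives
\[ 16 = 2 \times 7 + 2 \]
\[ 7 = 3 \times 2 + 1. \]
Starting with the last equation, we have
\[ 1 = 7 - 3 \times 2 = 7 - 3(16 - 2 \times 7) = -3 \times 16 + 7 \times 7. \]
Hence $7 \times 7 = 3 \times 16 + 1$, so $7 \times_{16} 7 = 1$ and
therefore $7^{-1} = 7$ in $\mathbb{Z}_{16}$.


(b) Euclid's Algorithm gives
\[ 51 = 6 \times 8 + 3 \]
\[ 8 = 2 \times 3 + 2 \]
\[ 3 = 1 \times 2 + 1. \]
Starting with the last equation, we have
\[ 1 = 3 - 2 = 3 - (8 - 2 \times 3) = -8 + 3 \times 3 = -8 + 3(51 - 6 \times 8) = 3 \times 51 - 19 \times 8. \]
Hence $(-19) \times 8 \equiv 1 \pmod{51}$, but −19 + 51 = 32,
so
\[ 32 \times 8 \equiv 1 \pmod{51}. \]
Hence $32 \times_{51} 8 = 1$, so $8^{-1} = 32$ in $\mathbb{Z}_{51}$.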
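Euclid's Algorithm together with the bookkeeping done by backwards substitution can be sketched in Python. Note the assumption: the code below uses the iterative "extended Euclid" form, which tracks the multipliers as it goes instead of substituting backwards at the end; the two approaches give the same inverse. The function names are hypothetical.

```python
def extended_euclid(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, s, t) with g = HCF(a, b) and g = s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r     # the usual Euclid step
        old_s, s = s, old_s - q * s     # keep old_r == old_s*a + old_t*b
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def inverse_mod(a: int, n: int):
    """Multiplicative inverse of a in Z_n, or None if a and n are not coprime."""
    g, s, _ = extended_euclid(a, n)
    if g != 1:
        return None
    return s % n                        # move the multiplier into Z_n


print(inverse_mod(7, 16), inverse_mod(8, 51), inverse_mod(81, 201))
```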


Solution to Exercise A98 
(a) The given equation is
\[ 7 \times_{16} x = 3. \]
Multiplying both sides by the multiplicative
inverse of 7 in $\mathbb{Z}_{16}$, which is 7, gives
\[ 7 \times_{16} 7 \times_{16} x = 7 \times_{16} 3, \]
that is,
\[ 1 \times_{16} x = x = 7 \times_{16} 3. \]
Since 7 × 3 = 21 = 16 + 5, we have x = 5.
Thus the equation $7 \times_{16} x = 3$ has solution x = 5.
(b) The given equation is
\[ 8 \times_{51} x = 19. \]
Multiplying both sides by the multiplicative
inverse of 8 in $\mathbb{Z}_{51}$, which is 32, gives
\[ 32 \times_{51} 8 \times_{51} x = 32 \times_{51} 19, \]
that is,
\[ 1 \times_{51} x = x = 32 \times_{51} 19. \]
Since 32 × 19 = 608 = 510 + 98 = 510 + 51 + 47, we
have x = 47.


Thus the equation $8 \times_{51} x = 19$ has solution
x = 47.

Solution to Exercise A99 


(a) Observe that $2 \equiv 15 \pmod{13}$, and we know
5 × 3 = 15, so we have
\[ 5 \times 3 \equiv 2 \pmod{13}. \]
Hence the solution of the given equation is x = 3.


Alternatively, $5^{-1} = 8$ in $\mathbb{Z}_{13}$ (since
5 × 8 = 40 = 39 + 1, so $5 \times_{13} 8 = 1$). We have
8 × 2 = 16 = 13 + 3, so $x = 8 \times_{13} 2 = 3$.


(b) Observe that $5 \equiv -6 \pmod{11}$, and we know
3 × (−2) = −6, so we have
\[ 3 \times (-2) \equiv 5 \pmod{11}. \]
The integer −2 is not an element of $\mathbb{Z}_{11}$, but
\[ -2 \equiv 9 \pmod{11}. \]
Hence the solution of the given equation is x = 9.


Alternatively, $3^{-1} = 4$ in $\mathbb{Z}_{11}$ (since
3 × 4 = 12 = 11 + 1, so $3 \times_{11} 4 = 1$). We have
4 × 5 = 20 = 11 + 9, so $x = 4 \times_{11} 5 = 9$.




Solution to Exercise A100 


(a) The HCF of 9 and 12 is d = 3, and this is also
a factor of 6, so the equation $9 \times_{12} x = 6$ has d = 3
solutions.

To find the smallest solution of the given equation,
we solve the equation
\[ \frac{9}{3} \times_{12/3} x = \frac{6}{3}, \]
that is,
\[ 3 \times_4 x = 2. \]
By trying possibilities, we find that this equation
has solution x = 2, since 3 × 2 = 6 and $6 \equiv 2 \pmod{4}$.
Also n/d = 12/3 = 4, so the other
solutions are x = 2 + 4 = 6 and x = 2 + 2 × 4 = 10.


(b) The HCF of 8 and 12 is 4, but this is not a
factor of 7, so the equation $8 \times_{12} x = 7$ has no
solutions.


(c) The HCF of 5 and 12 is 1; that is, they are
coprime. Hence the equation $5 \times_{12} x = 2$ has a
unique solution.


The solution, x = 10, can be found in various ways:
for example

• by noticing that 5 × 5 = 25 and $25 \equiv 1 \pmod{12}$,
so $5^{-1} = 5$ in $\mathbb{Z}_{12}$ and therefore
$x = 5^{-1} \times_{12} 2 = 10$

• by spotting that $2 \equiv -10 \pmod{12}$, so
$5 \times (-2) \equiv 2 \pmod{12}$, and since $-2 \equiv 10 \pmod{12}$
we have $5 \times_{12} 10 = 2$

• by testing possible values for x.

(d) The HCF of 4 and 16 is d = 4, and this is also
a factor of 12, so the equation $4 \times_{16} x = 12$ has
d = 4 solutions.


To find the smallest solution of the given equation,
we solve the equation
\[ \frac{4}{4} \times_{16/4} x = \frac{12}{4}, \]
that is,
\[ 1 \times_4 x = 3, \]
which simplifies to the solution x = 3.
Also n/d = 16/4 = 4, so the other solutions are
x = 3 + 4 = 7, x = 3 + 2 × 4 = 11 and
x = 3 + 3 × 4 = 15.


(e) The HCF of 3 and 16 is 1; that is, they are
coprime. Hence the equation $3 \times_{16} x = 13$ has a
unique solution.


The solution, x = 15, can be found in various ways.
For example, you could test possible values for x:
you would eventually find that
\[ 3 \times 15 = 45 = 2 \times 16 + 13, \]
so $3 \times_{16} 15 = 13$. Alternatively, you might spot
that $13 \equiv -3 \pmod{16}$, so
\[ 3 \times (-1) \equiv 13 \pmod{16}, \]
which gives
\[ 3 \times 15 \equiv 13 \pmod{16}, \]
and hence $3 \times_{16} 15 = 13$. Alternatively again, you
might start by finding the multiplicative inverse of
3 in $\mathbb{Z}_{16}$; a quick way to do this is to observe that
$3 \times 11 = 33 \equiv 1 \pmod{16}$, so $3^{-1} = 11$ in $\mathbb{Z}_{16}$. This
gives $x = 3^{-1} \times_{16} 13 = 15$.

(f) The HCF of 8 and 16 is 8, but this is not a
factor of 2, so the equation $8 \times_{16} x = 2$ has no
solutions.


Unit A3 
Mathematical language and proof 


Introduction 


This unit gives an introduction to mathematical proof. While you have 
already met proofs in your previous mathematical studies, the emphasis of 
your studies is likely to have been not on proofs, but on problems that can 
be solved by, essentially, computing a result. 


In this module the emphasis shifts to a more abstract approach to 
mathematics, where the goal is to describe clearly properties of 
mathematical objects using mathematical statements, and to establish 
their correctness using proofs. 


Section 1 introduces the language used to express mathematical statements 
and reviews the ways in which statements can be combined. Sections 2 
and 3 introduce various techniques for proving that a mathematical 
statement is true. As a further introduction to abstract mathematical 
thinking, Section 4 introduces the concept of an equivalence relation on a 
set. Equivalence relations are important in many areas of mathematics. 
You will meet them again in the group theory units of this module. 


1 Mathematical statements 


In Units A1 Sets, functions and vectors and A2 Number systems you have
seen many examples of mathematical statements, theorems and proofs. In 
this section you will look in detail at mathematical statements and the 
ways in which they can be combined and negated. This sets the scene for 
Sections 2 and 3, where you will learn about methods of proof. 


1.1 Statements and negations 


The building blocks of mathematical theorems and proofs are assertions 
called statements, also known as propositions. In mathematics, a 
statement is an assertion that is either true or false, though we may not 
know which. The following are examples of statements. 


1. The equation 2x − 3 = 0 has solution $x = \tfrac{3}{2}$.

2. 1 + 1 = 3.

3. $1 + 3 + 5 + \cdots + (2n - 1) = n^2$ for each positive integer n.

4. There is a real number x such that cos x = x.

5. Every even integer greater than 2 is the sum of two prime numbers.

6. x is greater than 0.


In this list, Statement 1 is true and Statement 2 is false. Statements 3 

and 4 are both true, though this is probably not immediately obvious to 
you in either case. We shall prove that Statement 3 is true later in this 
section. You can check that Statement 4 is true by noting that the graphs 
of y = cos x and y = x intersect; a rigorous proof can be obtained by using 
the Intermediate Value Theorem, which you will meet in the analysis units 
of this module. At the time of writing it is not known whether Statement 5 
is true or false; it is known as Goldbach’s Conjecture, and mathematicians 
have been trying to prove it since 1742. 
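Statements 3 and 4 can be explored numerically before they are proved. The sketch below is evidence only, not a proof: it checks Statement 3 for finitely many n, and it locates an approximate solution of cos x = x by bisection on the interval [0, 1], where cos x − x changes sign. The function names and the choice of interval are assumptions made for illustration.

```python
import math


def sum_of_first_odd_numbers(n: int) -> int:
    """1 + 3 + 5 + ... + (2n - 1), computed term by term."""
    return sum(2 * k - 1 for k in range(1, n + 1))


def statement_3_holds_up_to(max_n: int) -> bool:
    """Check Statement 3 for n = 1, ..., max_n (a finite check only)."""
    return all(sum_of_first_odd_numbers(n) == n * n for n in range(1, max_n + 1))


def approximate_cos_fixed_point(lo: float = 0.0, hi: float = 1.0,
                                tol: float = 1e-12) -> float:
    """Bisection for cos x = x, using cos(0) - 0 > 0 and cos(1) - 1 < 0."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if math.cos(mid) - mid > 0:
            lo = mid          # the sign change lies to the right of mid
        else:
            hi = mid          # the sign change lies to the left of mid
    return (lo + hi) / 2


print(statement_3_holds_up_to(200))
print(approximate_cos_fixed_point())
```

No amount of checking of this kind settles Statement 3 for every positive integer n, which is why a proof is still needed.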




Figure 1 Extract from Goldbach's letter to Euler


On 7 June 1742 the German mathematician Christian Goldbach 
(1690-1764) posed his conjecture in a letter to Leonhard Euler 
(1707-1783). An extract from this letter is shown in Figure 1. In the 
same letter Goldbach also proposed what is now known as the Weak 
Goldbach Conjecture. This states that every odd number greater 
than 5 can be expressed as the sum of three primes. The Weak 
Goldbach Conjecture was proved by the Peruvian mathematician 
Harald Helfgott in 2013. Goldbach and Euler first met at the St 
Petersburg Academy of Sciences in 1727 when Euler was appointed to 
a position in the mathematics division, and where Goldbach was 
professor of mathematics. After Goldbach moved to Moscow in 1729 
they began a correspondence which lasted 35 years. 


Statement 6 is a little different from the others, since whether it is true or 
false depends on the value of the variable x. A statement, such as this one, 
that is either true or false depending on the value of one or more variables, 
is called a variable proposition. We usually denote statements by the 
capital letters P, Q, R, ..., and we denote variable propositions containing 
the variable x by P(x), Q(x), .... 


When considering a variable proposition, we must have in mind a suitable 
set of values from which the possible values of the variable are taken. For 
example, the set associated with Statement 6 might be R, since for each 
real number x the assertion is either true or false. A variable proposition 
with several variables may have several such associated sets. Often the set 
or sets associated with a variable are clear from the context and so we do 
not state them explicitly. In particular, unless it is stated otherwise, we 
assume that if the variable is x or y, then the associated set is R, whereas 
if the variable is n or m, then the associated set is Z or N, depending on 
the context. 


An example of an assertion that is not a mathematical statement is '{1, 2}
is greater than 0'. Since {1, 2} is a set, and sets cannot be greater than
(nor less than or equal to) zero, the assertion is meaningless, and therefore
is neither true nor false. Other examples are 'π is interesting' and '1000 is
a large number', which are not precise enough to be either true or false.




Exercise A101 


Determine whether each of the following assertions is a mathematical 
statement. For those that are mathematical statements, state whether or 
not they are variable propositions. 


(a) n is even or n is prime.

(b) The set of odd integers less than 3 is small.

(c) {1, 2, 3, 4} is odd.

(d) $\{1, 2, 3, 4\} \cap \{6, 7, 8, 9\} \neq \varnothing$.
(Remember that ∅ denotes the empty set.)


A theorem is simply a mathematical statement that is true. However, we 
usually reserve the word for a statement that is considered to be of some 
importance, and whose truth is not immediately obvious, but instead has 
to be proved. A proposition is a ‘less important’ theorem, and a lemma 
is a theorem that is used in the proof of other theorems. A corollary is a 
theorem that follows from another theorem by a short additional 
argument. Theorems are sometimes called results. 


As you may have noticed, we have used the word proposition in two quite 
different ways in this subsection. It can either mean a ‘less important’ 
theorem, as just explained, or it can be used with the same meaning as the 
word ‘statement’ (this is its meaning in the phrase ‘variable proposition’). 
Both meanings are in common use in mathematics, so you should be aware 
of them both. Normally, the intended meaning will be clear from the 
context. 


Every statement has a related statement, called its negation, which is true 
when the original statement is false, and false when the original statement 
is true. The negation of a statement P can usually be written as ‘it is not 

the case that P’, but there are often better, more concise ways to express a 
negation. Thus, for example, the negation of the variable proposition 


x is greater than 0 
can be written as 

it is not the case that x is greater than 0, 
but is better expressed as 

x is not greater than 0 
or even as 

$x \le 0$.


We usually denote the negation of a statement P by ‘not P’. The process 
of finding the negation of a statement is called negating the statement. 
Here are some more examples. 




Unit A3 Mathematical language and proof 




Worked Exercise A45 


Express concisely the negations of each of the following statements. 


(a) There are at least 10 two-digit natural numbers less than 20. 


(b) 7 is less than 5. 


Exercise A102 


Express concisely the negations of each of the following statements. 


(a) x = 2 is a solution of the equation 32 + 5 = 0. 


(b) The equation $n^2 + n - 2 = 0$ has exactly two solutions.


In the rest of this section you will learn about the possible structures of 
mathematical statements and their negations. 


1.2 Conjunctions and disjunctions 


Statements can be combined in various ways to give more complicated 
statements. 


Inserting the word ‘and’ between two statements P and Q gives a new 
statement, called the conjunction of P and Q, which is true if both of P 
and Q are true, and false if at least one of P or Q is false. 


For example, the variable proposition 
x is greater than 0 and x is an integer


is true if both of the statements ‘x is greater than 0’ and ‘x is an integer’ 
are true, and false otherwise. Thus the combined statement is true if x = 4 
but false if x = 3.5. 


It is sometimes necessary to rephrase a statement to recognise that it is a 
conjunction. For example, a statement of the form ‘P but not Q’ is in fact 
the conjunction ‘P and not Q’. Thus, the statement 


2 is prime but it is not odd 
can be treated as the conjunction ‘2 is prime and 2 is not odd’. 


Inserting the word ‘or’ between two statements P and Q also gives a new 
statement, the disjunction of P and Q, which is true if at least one of P 
or Q is true, and false if both of P and Q are false. Thus, the word ‘or’ is 
used in its inclusive sense in mathematical statements: ‘P or Q’ means 
‘either P or Q or possibly both’. 


For example, the variable proposition 
x is greater than 0 or x is an integer 


is true if at least one of the statements ‘x is greater than 0’ and ‘x is an 
integer’ is true, and false otherwise. Thus this combined statement is true 
if $x = 4$, $x = 3.5$ or $x = -4$ but false if $x = -3.5$.
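
The two combined statements above can be tried out at the sample values just mentioned. The following is only an illustrative Python sketch; the function names are chosen here for illustration and are not part of the module text.

```python
def is_integer(x: float) -> bool:
    """Return True if x has no fractional part."""
    return float(x).is_integer()


def conjunction(x: float) -> bool:
    """'x is greater than 0 and x is an integer'."""
    return x > 0 and is_integer(x)


def disjunction(x: float) -> bool:
    """'x is greater than 0 or x is an integer' (inclusive 'or')."""
    return x > 0 or is_integer(x)


# Evaluate both variable propositions at the values used in the text.
for x in (4, 3.5, -4, -3.5):
    print(x, conjunction(x), disjunction(x))
```

Note that Python's `or` is inclusive, which matches the mathematical usage of ‘or’ described above.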


Just as for conjunctions, it may be necessary to rephrase a statement to 
recognise that it is a disjunction. For example, statements of the form ‘at 
least one of P or Q holds’, or ‘either P, or Q’, are different ways of 
expressing the disjunction ‘P or Q’. So the statement 


at least one of m or n is odd 
can be treated as the disjunction 


m is odd or n is odd. 


Negating conjunctions and disjunctions 


Since the statement ‘P and Q’ is true exactly when both P and Q are true, 
its negation is true when at least one of P or Q is false. Thus the negation 
of ‘P and Q’ is the statement ‘not P or not Q’. 




Worked Exercise A46 


Negate the following conjunctions. 


(a) n is positive and p is prime 
(b) The sets A and B are both empty. 


(c) p is an odd prime.


Solution 


(a) This statement is false when at least one of ‘n is positive’ and
‘p is prime’ is false.

The negation is

n is less than or equal to 0, or p is not prime.


(b) This statement can be expressed as

The set A is empty and the set B is empty.

The statement is false when at least one of ‘the set A is empty’
and ‘the set B is empty’ is false.

The negation is

The set A is non-empty or the set B is non-empty.


(c) This statement can be expressed as

p is odd and p is a prime.

The statement is false when at least one of ‘p is odd’ and ‘p is a
prime’ is false.

The negation is

p is even or p is not prime.


Similarly, the negation of ‘P or Q’ is true exactly when both of P and Q 
are false; that is, exactly when ‘not P and not Q’ is true. A little thought 
and some examples should convince you of this. 
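
If you would like a mechanical check of these two negation rules, the short Python sketch below runs through all four combinations of truth values of P and Q. It is only an illustration (the function name is hypothetical), and it treats P and Q as plain true/false statements rather than variable propositions.

```python
from itertools import product


def negation_rules_hold() -> bool:
    """Check both negation rules for every truth assignment of P and Q."""
    for p, q in product([True, False], repeat=2):
        # Negation of a conjunction: not (P and Q) is (not P) or (not Q).
        if (not (p and q)) != ((not p) or (not q)):
            return False
        # Negation of a disjunction: not (P or Q) is (not P) and (not Q).
        if (not (p or q)) != ((not p) and (not q)):
            return False
    return True


print(negation_rules_hold())
```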


Worked Exercise A47 


Negate the following disjunctions. 


(a) Either m or m + 1 is even 
(b) $x > 0$ or $y > 0$.
(c) Either $A = B$ or $A \cap B = \emptyset$.




Exercise A103 


Express concisely the negations of each of the following statements. 
(a) Both x and y are integers. 

(b) The integer m is even but the integer n is odd. 

(c) At least one of the integers m or n is odd. 

(d) Either $A = \emptyset$ or $B = \emptyset$.


1.3 Implications 


Many mathematical statements are of the form ‘if something, then 
something else’, for example: 


if $x > 2$, then $x^2 > 4$.


This type of statement is called an implication. An implication is made 
up of two statements, which in the example above are ‘$x > 2$’ and ‘$x^2 > 4$’,
and can be expressed by combining these statements using the words ‘if’ 
and ‘then’. In an implication ‘if P, then Q’, the statement P is called the 
hypothesis of the implication, and the statement Q is called the 
conclusion. 




It is important to be clear about exactly what an implication asserts. The 
statement above asserts only that if you know that $x > 2$, then you can be
sure that $x^2 > 4$. It does not assert anything about the truth or falsity of
‘$x^2 > 4$’ when x is not greater than 2. In general, the implication


if P, then Q 


asserts that if P is true, then Q is also true; that is, that it cannot happen 
that P is true and Q is false. The implication does not assert anything 
about the truth or falsity of Q when P is false. 


If x is a real variable, then the statement 
if $x > 2$, then $x^2 > 4$


is true because for every real number x for which ‘$x > 2$’ is true, ‘$x^2 > 4$’ is
also true. Strictly speaking, this statement should be expressed as


for all $x \in \mathbb{R}$, if $x > 2$, then $x^2 > 4$.


However, it is conventional to omit the initial ‘for all $x \in \mathbb{R}$’, and interpret
the statement as if it were there. We adopt this convention throughout 
this module (indeed, it is used in almost all mathematical texts), so 
statements of the form ‘if P, then Q’, where P and/or Q are variable 
propositions, should be interpreted as applying to all possible values of the 
variables in the statements P and Q. 


An implication does not have to be expressed using the words ‘if’ and 
‘then’ — there are many other ways to convey the same meaning. The 
left-hand side of the table below lists some ways of expressing the 
implication ‘if P, then Q’. The right-hand side gives examples for the 
particular implication ‘if $x > 2$, then $x^2 > 4$’.


Ways of writing ‘if P, then Q’    Ways of writing ‘if $x > 2$, then $x^2 > 4$’

P implies Q                       $x > 2$ implies $x^2 > 4$
$P \Rightarrow Q$                 $x > 2 \Rightarrow x^2 > 4$
P is sufficient for Q             $x > 2$ is sufficient for $x^2 > 4$
P only if Q                       $x > 2$ only if $x^2 > 4$
Q whenever P                      $x^2 > 4$ whenever $x > 2$
                                  (or: $x^2 > 4$, for all $x > 2$)
Q follows from P                  $x^2 > 4$ follows from $x > 2$
Q is necessary for P              $x^2 > 4$ is necessary for $x > 2$
Q provided that P                 $x^2 > 4$ provided that $x > 2$

The symbol $\Rightarrow$ is read as ‘implies’, and it is commonly used in
mathematical notation. The form ‘P only if Q’ may seem strange at first; 
it asserts that the only circumstance in which P can be true is if Q is also 
true. In other words, if P is true, then Q must also be true — that is, 

P implies Q. 




The notation $\Rightarrow$ was first used by Nicolas Bourbaki in 1954.
Nicolas Bourbaki was the pseudonym for a group of (mainly French) 
mathematicians who from 1935 over a period of thirty years produced 
an influential series of textbooks that were designed to present all of 
pure mathematics in a completely structured and axiomatic way. The 
name Bourbaki derives from that of a nineteenth-century French 
general, Charles Bourbaki, and was adopted by the group as a 
reference to a prank lecture by a student. 


[Figure: The founders of the Bourbaki group]


The next exercise is for you to practise working with implications. In
Section 2 you will see how to formally prove or disprove statements like
those in parts (b) and (c). Whether the statement in part (a) is true or
false can be established by algebraic manipulation.


Exercise A104 


Rewrite each of the following statements in the form ‘if P, then Q’. In 
each case, state whether you think the implication is true. You are not 
asked to justify your answers. 

(a) $x^2 - 2x + 1 = 0 \Rightarrow (x - 1)^2 = 0$.

(b) Whenever n is odd, so is $n^3$.

(c) Every integer that is divisible by 3 is also divisible by 6.

(d) $x > 2$ only if $x > 4$.

(e) $x^2 < 0$ provided that $x < 0$.


Many theorems have statements of the form ‘Let P. Then Q’. This is an 
alternative way to express a theorem of the form ‘if P, then Q’. You have 
already met several theorems stated in this form — for example, the 
Division Theorem (Theorem A9 in Unit A2). 


Theorem A9 Division Theorem 


Let a and n be integers, with n > 0. Then there are unique integers q 
and r such that 


$a = qn + r$, with $0 \le r < n$.




The theorem could be restated as follows. 


Theorem A9 Division Theorem (version 2) 
If a and n are integers, with n > 0, then there are unique integers q
and r such that


$a = qn + r$, with $0 \le r < n$.
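
As an aside, the quotient q and remainder r of the Division Theorem can be computed for particular integers. The Python sketch below (with a hypothetical function name) relies on the built-in `divmod`, which for a positive divisor n returns a remainder satisfying 0 ≤ r < n, even when a is negative. It illustrates existence for sample values only; it says nothing about uniqueness.

```python
def quotient_and_remainder(a: int, n: int) -> tuple[int, int]:
    """Return (q, r) with a == q * n + r and 0 <= r < n, assuming n > 0."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return divmod(a, n)


for a, n in [(17, 5), (-7, 3), (12, 4)]:
    q, r = quotient_and_remainder(a, n)
    # Confirm the two conditions in the statement of the theorem.
    assert a == q * n + r and 0 <= r < n
    print(f"{a} = {q} x {n} + {r}")
```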


The negation of an implication 


Contrary to what you might expect, the negation of an implication is not 
another implication — rather, it is a conjunction. To see why, it might help 
to think about the implication 


if P, then Q 
as asserting 

it is not the case that P is true and Q is false. 
Thus, negating the implication is equivalent to asserting that it is the case 
that P is true and Q is false, which is the conjunction 

P and not Q. 
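
The reasoning above can be tabulated mechanically. In the Python sketch below, which is only an illustration, the hypothetical helper `implies` encodes ‘if P, then Q’ exactly as described: it is not the case that P is true and Q is false. The loop then prints, for each combination of truth values, the implication, its negation and the conjunction ‘P and not Q’.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """'if P, then Q': it is not the case that P is true and Q is false."""
    return not (p and not q)


def negation_is_conjunction() -> bool:
    """Check that 'not (if P, then Q)' agrees with 'P and not Q' in all cases."""
    return all(
        (not implies(p, q)) == (p and not q)
        for p, q in product([True, False], repeat=2)
    )


for p, q in product([True, False], repeat=2):
    print(p, q, implies(p, q), not implies(p, q), p and not q)
```

The only row in which the implication is false is the one where P is true and Q is false.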

A non-mathematical example might be helpful here. Consider the 
statement 


If it snows before the next train to London is due to leave, then the 
next train to London gets cancelled. 


If you want to negate this statement, you need to think about what has to 
happen in order for it to be false: that occurs if it snows before the next 
train to London is due to leave and the train leaves anyway. So the 
negation is 

It snows before the next train to London is due to leave, and the
next train to London does not get cancelled.

You will need to work with negations of implications when you meet
counterexamples later in Section 2, so it will help to practise negating
implications with mathematical content. This is the topic of the next
worked exercise and exercise.

Here and later in the unit, we sometimes use brackets to avoid ambiguity
when the conclusion of an implication is a conjunction or a disjunction. 
For example, in the implication 

if the product mn is odd, then (m is odd and n is odd), 


the conclusion is the conjunction ‘m is odd and n is odd’. Enclosing this 
conclusion in brackets eliminates any possible confusion with the 
conjunction of the implication 


if the product mn is odd, then m is odd 


and the statement ‘n is odd’. 




Worked Exercise A48 


Write down the negations of each of the following implications. 


(a) If m is odd, then $m^2$ is even.
(b) If m divides 12, then (m divides 3 or m divides 4). 


Exercise A105 


Write down the negations of each of the following implications. 
(a) If m and n are odd, then m + n is odd. 
(b) If $A = \emptyset$, then ($A \cup B = \emptyset$ or $B - A = \emptyset$).


(For part (b), remember that $B - A$ denotes the set of elements of B that
are not elements of A.)




The converse of an implication 


Given any implication, we can form another implication, called its 
converse. The converse of the implication ‘if P, then Q’ is the implication 


if Q, then P. 

For example, the converse of the implication 
if $x > 2$, then $x^2 > 4$

is

if $x^2 > 4$, then $x > 2$.


In this example, the original implication is true, and its converse is false
(to see that the converse is false consider, for example, $x = -3$). It is also
possible for an implication and its converse to be both true, or both false. 
In other words, knowledge of whether an implication is true or false tells 
you nothing at all about whether its converse is true or false. You should 
remember this important fact whenever you read or write implications. 
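
The example above can be explored numerically. The Python sketch below (function and variable names are illustrative only) tests the implication and its converse at a handful of sample values. Finding no failure among the samples does not prove the implication, but a single failing value is enough to show that the converse is false.

```python
def implication_holds(x: float) -> bool:
    """'if x > 2, then x^2 > 4' evaluated at one value of x."""
    return not (x > 2 and not x * x > 4)


def converse_holds(x: float) -> bool:
    """'if x^2 > 4, then x > 2' evaluated at one value of x."""
    return not (x * x > 4 and not x > 2)


# Sample values -5.0, -4.5, ..., 4.5, 5.0 (an arbitrary finite selection).
samples = [k / 2 for k in range(-10, 11)]

implication_failures = [x for x in samples if not implication_holds(x)]
converse_failures = [x for x in samples if not converse_holds(x)]

print("implication fails at:", implication_failures)
print("converse fails at:", converse_failures)
```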


To help you remember these facts about implications, you may again find 
it helpful to consider non-mathematical examples. For example, the 
implication 


if Rosie is a sheep, then Rosie is less than two metres tall 
is true, but its converse, 
if Rosie is less than two metres tall, then Rosie is a sheep, 


certainly is not true! 


Exercise A106 


For each of the following implications about integers m and n, write down 
its converse and state whether you think the implication, its converse or 
both are true. You are not asked to justify your answers. 


(a) If m and n are both odd, then m + n is even.


(b) If one of the pair m, n is even and the other is odd, then m + n is odd.


The contrapositive of an implication 


Given any implication, we can form a further implication, called its 
contrapositive. Unlike the converse, the contrapositive is equivalent to 
the original implication. The contrapositive of the implication ‘if P, then 
Q’ is 

if not Q, then not P. 




For example, the contrapositive of the implication 


if x is an integer, then $x^2$ is an integer

is the implication

if $x^2$ is not an integer, then x is not an integer.


You can think of an implication and its contrapositive as asserting the 
same thing, but in different ways. You should take a few moments to 
convince yourself of this in the case of the example just given. 


Try this also with the following non-mathematical example. The 
contrapositive of the implication 


if Rosie is a sheep, then Rosie is less than two metres tall 
is 

if Rosie is not less than two metres tall, then Rosie is not a sheep, 
or, more simply, 

if Rosie is at least two metres tall, then Rosie is not a sheep. 


Contrapositive implications are a key ingredient of an important method of 
proof, proof by contraposition, that you will meet in Subsection 3.2. For 
now, it is important to remember the distinction between the converse of 
an implication, which is not equivalent to the implication, and its 
contrapositive, which is. A little practice should help you remember this 
distinction. 
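
One way to see the distinction is to compare truth values directly. The following Python sketch is only an illustration: it treats P and Q as plain true/false statements and uses a hypothetical helper `implies`. It checks whether an implication always has the same truth value as its contrapositive, and whether it always has the same truth value as its converse.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """'if P, then Q' is false only when P is true and Q is false."""
    return (not p) or q


rows = list(product([True, False], repeat=2))

# Contrapositive: 'if not Q, then not P'.
contrapositive_agrees = all(
    implies(p, q) == implies(not q, not p) for p, q in rows
)

# Converse: 'if Q, then P'.
converse_agrees = all(implies(p, q) == implies(q, p) for p, q in rows)

print(contrapositive_agrees, converse_agrees)
```

The contrapositive agrees with the implication in every row, whereas the converse differs, for instance, when P is true and Q is false.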


Exercise A107 


For each of the following implications about integers m and n, write down 
its converse and its contrapositive and state whether you think the 
converse, the contrapositive or both are true. You are not asked to justify 
your answers. 


(a) If the product mn is even, then at least one of m or n is even. 


(b) If q divides the product mn, then (q divides m or q divides n). 


1.4 Equivalences


The statement

if P, then Q, and if Q, then P,


which asserts that the implication ‘if P, then Q’ and its converse are both 
true, is usually expressed more concisely as 


P if and only if Q. 


Recall that ‘P if Q’ means ‘$Q \Rightarrow P$’, and ‘P only if Q’ means
‘$P \Rightarrow Q$’, so the phrase ‘if and only if’ is rather natural in this context.




If the statement ‘P if and only if Q’ is true, then P and Q are either both 
true or both false — in other words, if either one of P or Q is true, then so 
is the other. 


Here are two examples: 

1. n is odd if and only if $n^2$ is odd

2. $x > 2$ if and only if $x^2 > 4$.

Statements like these are called equivalences.


Equivalence 1 above is true because both the implications ‘if n is odd, then
$n^2$ is odd’ and ‘if $n^2$ is odd, then n is odd’ are true. However, equivalence 2
is false because the implication ‘if $x^2 > 4$, then $x > 2$’ is false.


You have met equivalences before. For example, the Factor Theorem 
(Theorem A2 in Unit A2) contains an equivalence in its statement. 


Theorem A2 Factor Theorem (in R) 


Let $p(x)$ be a real polynomial, and let $a \in \mathbb{R}$. Then $p(a) = 0$ if and
only if $x - a$ is a factor of $p(x)$.


As with implications, there are many different ways to express 
equivalences. The table below lists some ways in which this can be done, 
with illustrations using example 1 above. 


Ways of writing ‘P if and only if Q’    Ways of writing ‘n is odd if and only if $n^2$ is odd’

$P \Leftrightarrow Q$                   n is odd $\Leftrightarrow$ $n^2$ is odd
P is equivalent to Q                    n is odd is equivalent to $n^2$ is odd
P is necessary and sufficient for Q     n is odd is necessary and sufficient for $n^2$ to be odd


The symbol $\Leftrightarrow$ is commonly used to denote equivalences. It is usually
read as ‘if and only if’, or sometimes as ‘is equivalent to’.


It is important to remember that the symbol $\Leftrightarrow$ denotes equivalence
between statements rather than equality between expressions, and should
never be used in place of =. For example, it is incorrect to write
$x^2 - 1 \Leftrightarrow (x + 1)(x - 1)$, but correct to write either


$x^2 - 1 = (x + 1)(x - 1)$

or

$x^2 - 1 = 0 \Leftrightarrow (x + 1)(x - 1) = 0$.




Exercise A108 


For each of the following equivalences about integers, write down the two 
implications that it asserts, state whether you think each implication is 
true and hence state whether you think the equivalence is true. You are 
not asked to justify your answers. 


(a) The product mn is odd if and only if both m and n are odd. 


(b) The product mn is even if and only if both m and n are even. 


In some cases, it is helpful to think of the equivalence $P \Leftrightarrow Q$ in terms
of $P \Rightarrow Q$ and the implication ‘not $P$ $\Rightarrow$ not $Q$’. Recall that
$Q \Rightarrow P$ is equivalent to its contrapositive ‘not $P$ $\Rightarrow$ not $Q$’.
Therefore, since $P \Leftrightarrow Q$ asserts that both $P \Rightarrow Q$ and $Q \Rightarrow P$
hold, an alternative way to express the equivalence is to assert both


$P \Rightarrow Q$ and (not $P$ $\Rightarrow$ not $Q$).

For example, the equivalence

m is even $\Leftrightarrow$ $m^2$ is even

can be expressed as

(m is even $\Rightarrow$ $m^2$ is even) and (m is odd $\Rightarrow$ $m^2$ is odd).


Theorem A11 in Unit A2 contains an equivalence stated in this form. 


Theorem A11

Let n and a be positive integers, with a in $\mathbb{Z}_n$.

• If a and n are coprime, then a has a multiplicative inverse in $\mathbb{Z}_n$.

• If a and n are not coprime, then a does not have a multiplicative
inverse in $\mathbb{Z}_n$.


Thus, this theorem can be stated more succinctly as follows. 


Theorem A11 (version 2) 


Let n and a be positive integers, with a in $\mathbb{Z}_n$. Then a has a
multiplicative inverse in $\mathbb{Z}_n$ if and only if a and n are coprime.


We will prove this theorem in Subsection 3.1. 




Although a mathematical statement should normally be interpreted as 
meaning precisely what it says — no more and no less — there is one 
common exception to this rule. When giving a definition, we usually write 
‘if’ when we really mean ‘if and only if’. You have seen many examples of
definitions in this form throughout Units A1 and A2; below are two
specific ones. 


Definition 


A set A is a subset of a set B if each element of A is also an element 
of B. 


This definition (from Subsection 2.5 of Unit A1) is really stating that the 
two statements 


A is a subset of B 
and 
each element of A is also an element of B 


are equivalent. Below is the second example (from Subsection 3.2 of 
Unit A2). 


Definition 
Let n be a positive integer. Two integers a and b are congruent
modulo n if $a - b$ is a multiple of n; that is, if a and b have the same
remainder on division by n.


Again, this definition is really saying that the statements ‘a and b are
congruent modulo n’ and ‘$a - b$ is a multiple of n’ are equivalent.


1.5 Universal and existential statements 


Many mathematical statements include the phrase ‘for all’, or another 
expression with the same meaning. Here are a few examples. 


1. $x^2 \ge 0$ for all real numbers x.

2. Every multiple of 6 is divisible by 3.

3. $1 + 3 + 5 + \cdots + (2n - 1) = n^2$ for each positive integer n.

4. Any rational number is a real number.


Statements of this type are known as universal statements, and the phrase 
‘for all’, and its equivalents, are referred to as the universal quantifier. 


The universal quantifier is sometimes denoted by the symbol $\forall$; for
example, the first universal statement above might be abbreviated as


$\forall x \in \mathbb{R}$, $x^2 \ge 0$,


which is read as ‘for all x in R, x squared is greater than or equal to zero’. 




Statements that begin with a phrase like ‘There are no...’ or ‘There does 
not exist ...’ are universal statements because they can be rephrased in 
terms of ‘for all’. For example, the statement 


there is no integer n such that $n^2 = 3$

can be rephrased as

for all integers n, $n^2 \ne 3$.


In Subsection 1.3 you met an important class of universal statements. 
Recall that implications of the form 


$P(x) \Rightarrow Q(x)$,


should strictly be expressed as ‘for all x, if P(x), then Q(x)’, but the initial 
‘for all x’ is generally omitted by convention. So implications where the 
hypothesis and the conclusion are variable propositions are in fact 
universal statements where the universal quantifier is omitted. For 
example, the statement 


if n is a multiple of 6, then n is a multiple of 3 
means in fact 
for all integers n, if n is a multiple of 6, then n is a multiple of 3. 


We now turn to another type of statement with a quantifier. Some 
mathematical statements include the phrase ‘there exists’, or another 
expression with the same meaning. Here are a few examples. 


1. There exists a real number that is not a rational number. 
2. There is a real number x such that $\cos x = x$.

3. Some multiples of 3 are not divisible by 6. 

4. The equation $x^3 + x^2 + 5 = 0$ has at least one real solution.


Statements of this type are known as existential statements, and the 
phrase ‘there exists’ and its equivalents are referred to as the existential 
quantifier. 


In the third example, the word some is used to mean ‘at least one’, rather 
than several. It is important to remember that this is the standard 
mathematical usage of ‘some’. 


The existential quantifier is sometimes denoted by the symbol $\exists$; for
example, the second existential statement above might be abbreviated as


$\exists x \in \mathbb{R}$ such that $\cos x = x$,


which is read as ‘there exists x in R such that cos x equals x’.


[Figures: Giuseppe Peano; Bertrand Russell; Gerhard Gentzen]


The symbol $\exists$ was introduced by Giuseppe Peano (1858-1932) in 1897
and was used by Bertrand Russell (1872-1970) and
Alfred North Whitehead (1861-1947) in their monumental
Principia Mathematica (1910-1913). In 1935 Gerhard Gentzen
(1909-1945) introduced the $\forall$ symbol. He called it the All-Zeichen (all
character), in analogy with $\exists$, which Gentzen said he borrowed from
Russell.


In natural language, the word ‘any’ can mean either ‘every’ or ‘at least 
one’, as in ‘any fool could do that’ and ‘did you win any prizes?’. In 
mathematics, the meaning depends on the context in a similar way. We try 
to avoid using ‘any’ where it might cause confusion. 


As already mentioned, it is often necessary to negate statements; for 
example, this is the case when we consider proof by contradiction or proof 
by contraposition, which you will meet in Section 3. The negation of 
universal and existential statements needs to be treated with particular 
care. The negation of a universal statement is an existential statement, 
and vice versa. This is illustrated by the examples in the table below. 


Statement                                          Negation

Every integer is a real number.                    There exists an integer that is not a real number.
There is an even prime number.                     Every prime number is odd.
The equation $x^2 + 4 = 0$ has a real solution.    For all real numbers x, $x^2 + 4 \ne 0$.


Exercise A109 


Express concisely the negations of each of these statements. 
(a) There is a real number x such that $\cos x = x$.
(b) There exists an integer that is divisible by 3 but not by 6. 


(c) Every real number x satisfies the inequality $x^2 > 0$.


You have now met the negations of a number of different types of 
statements, and to conclude this section we collect them together in the 
table below. 


Statement                        Negation

P                                not P
P and Q                          not P or not Q
P or Q                           not P and not Q
If P, then Q                     P and not Q
For all x, P                     There exists an x such that not P
There exists an x such that P    For all x, not P
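
The last two rows of the table can be tried out on a finite set of values, where ‘for all’ and ‘there exists’ correspond to Python's built-in `all` and `any`. The sketch below is only an illustration: the set of values and the proposition `P` are arbitrary choices made here, and a check over finitely many values is not a proof about infinite sets such as R or Z.

```python
values = range(-5, 6)  # an arbitrary finite set: -5, -4, ..., 5


def P(x: int) -> bool:
    """An arbitrary sample variable proposition: 'x squared is greater than 4'."""
    return x * x > 4


# Negation of 'for all x, P' compared with 'there exists an x such that not P'.
not_for_all = not all(P(x) for x in values)
exists_not = any(not P(x) for x in values)

# Negation of 'there exists an x such that P' compared with 'for all x, not P'.
not_exists = not any(P(x) for x in values)
for_all_not = all(not P(x) for x in values)

print(not_for_all == exists_not, not_exists == for_all_not)
```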


2 Direct proof 


The aim of this section and the next is to make you more familiar with the 
structures of various different types of mathematical proof. This section 
deals with direct methods of proof — that is, methods of proof which 
involve a series of logical steps leading from known facts and assumptions 
directly to the statement you wish to prove. In the next section you will 
consider indirect methods of proof. 


Working through proofs, producing your own proofs and critically 
assessing mathematical arguments should help you to express your own 
mathematical thoughts and ideas more clearly. 


In this module the proofs that you are asked to produce are simpler than 
many of the ones that are provided for you to read. Do not be discouraged 
if proof writing seems difficult at first: it is a skill that is acquired 
gradually. Working through the proofs that you meet is probably the most 
useful preparation. It is also important to study the more complex proofs 
that appear later in the module, and understand why they prove the 
statements that they claim to prove. 


A proof of a mathematical statement is a logical argument that establishes 
that the statement is true. Here is a simple example. 


Worked Exercise A49 


Prove the following statement. 


If n is an odd number between 0 and 10, then $n^2$ is also odd.


Solution 


The odd numbers between 0 and 10 are 1, 3, 5, 7 and 9. The squares 
of these numbers are 1, 9, 25, 49 and 81, respectively, and these are all 
odd. 




In the example above, there were only a small number of possibilities to 
consider, so it was easy to prove the statement by considering each one in 
turn. This method of proof is known as proof by exhaustion because we 
exhaust all possibilities. In contrast, it is not possible to prove the 
statement ‘If n is an odd number, then $n^2$ is also odd’ using proof by
exhaustion because there are infinitely many possibilities to consider. Most 
mathematical statements that you will come across cannot be proved by 
exhaustion because there are too many possibilities to consider — usually 
infinitely many. Instead we must supply a general proof. 
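
Proof by exhaustion is the one method of proof that can be carried out directly by a computer, since it amounts to checking every case. The Python sketch below (variable names are illustrative) repeats the check from Worked Exercise A49. It works only because the list of cases is finite, which is why the same approach cannot settle the statement for all odd numbers.

```python
# The odd numbers strictly between 0 and 10.
odd_numbers = [n for n in range(1, 10) if n % 2 == 1]

# Their squares, in the same order.
squares = [n * n for n in odd_numbers]

# Exhaust all possibilities: every square must itself be odd.
all_squares_odd = all(s % 2 == 1 for s in squares)

print(odd_numbers, squares, all_squares_odd)
```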


As an initial example of a general proof, we state and prove a result that 
applies to expressions of the form $a^n - b^n$. These expressions occur often in
calculations, and you have probably already met the factorisations


$a^2 - b^2 = (a - b)(a + b)$,
$a^3 - b^3 = (a - b)(a^2 + ab + b^2)$,
$a^4 - b^4 = (a - b)(a^3 + a^2 b + ab^2 + b^3)$,


and so on. The following general result can be proved by multiplying out 
the expression on the right-hand side. 


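Theorem A12 Geometric Series Identity

Let $a, b \in \mathbb{R}$ and let n be a positive integer. Then


$a^n - b^n = (a - b)(a^{n-1} + a^{n-2}b + \cdots + ab^{n-2} + b^{n-1})$.


Proof Expanding the right-hand side of the equality gives

$(a - b)(a^{n-1} + a^{n-2}b + \cdots + ab^{n-2} + b^{n-1})$
$= a^n + a^{n-1}b + \cdots + a^2 b^{n-2} + ab^{n-1}$
$\quad - (a^{n-1}b + a^{n-2}b^2 + \cdots + ab^{n-1} + b^n)$
$= a^n - b^n$,


as required. ∎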
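
A numerical spot check can help to guard against slips when using the identity. The Python sketch below (the function name is hypothetical) compares the two sides for a small range of integer values of a and b and for n from 1 to 6. It is a sanity check for these sample values only; the general result rests on the proof above.

```python
def right_hand_side(a: int, b: int, n: int) -> int:
    """(a - b)(a^(n-1) + a^(n-2) b + ... + a b^(n-2) + b^(n-1))."""
    return (a - b) * sum(a ** (n - 1 - k) * b ** k for k in range(n))


# Integer arithmetic is exact, so the comparison involves no rounding error.
identity_holds = all(
    a ** n - b ** n == right_hand_side(a, b, n)
    for a in range(-4, 5)
    for b in range(-4, 5)
    for n in range(1, 7)
)

print(identity_holds)
```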


The structure of the argument required in most proofs goes beyond the 
kind of algebraic verification used here. In the rest of this section, you will 
see how different techniques can be used for proofs that require arguments 
with a more complex structure. 




2.1 Proving implications 


Suppose we wish to prove that the implication $P \Rightarrow Q$ is true. We have
to prove that whenever the statement P is true, the statement Q is also
true. Often the best way to do this is to start out by assuming that P is
true, and proceed as follows. If we know that the statement


$P \Rightarrow P_1$


is true for some statement $P_1$, then we can deduce that $P_1$ is also true.
Similarly, if we know that the statement


$P_1 \Rightarrow P_2$


is true for some statement $P_2$, then we can deduce that $P_2$ is also true. In
this way we can build up a sequence of statements


$P, P_1, P_2, \ldots,$


each of which we know to be true under the assumption that P is true. The
aim is to build up such a sequence


$P, P_1, P_2, \ldots, P_n, Q$


which leads to Q. If this can be achieved, then we have a proof of the
implication $P \Rightarrow Q$. Here is an example.


Worked Exercise A50 


Prove that if n is odd, then n? is odd. 




In the proof in Worked Exercise A50, statement P is ‘n is odd’, and we 
start by assuming that this is true. Assumptions are generally introduced 
by words such as ‘let’, ‘suppose’ or ‘assume’. Statement $P_1$ is ‘$n = 2k + 1$
for some integer k’, and so on. We use words like ‘then’ and ‘hence’ to
indicate that one statement follows from another. The string of equalities


$n^2 = \cdots = 2(2k^2 + 2k) + 1$


in the proof can be regarded either as a sequence of three statements,
namely


$n^2 = (2k + 1)^2$,
$n^2 = (2k)^2 + 2(2k) + 1$,
$n^2 = 2(2k^2 + 2k) + 1$,

or as a single statement asserting the equality of all four expressions.


Many of the true statements about odd and even integers that appeared in 
the exercises in the last subsection can be proved using ideas similar to 
those of the proof in Worked Exercise A50; that is, we write an odd integer 
as 2 × some integer + 1, and an even integer as 2 × some integer. (Similarly,
we can often prove statements about multiples of 3 by writing each such
number as 3 × some integer, and so on.) Here is another example.


Worked Exercise A51 


Prove that the sum of two odd integers is even. 


We have seen that a sequence $P, P_1, P_2, \ldots, P_n, Q$ of statements forms a
proof of the implication $P \Rightarrow Q$ provided that each statement is shown
to be true under the assumption that P is true. In Worked Exercises A50
and A51, each statement in the sequence was deduced from the statement
immediately before, but the sequence can also include statements that are
deduced from one or more statements further back in the sequence, or
statements that we know to be true from our previous mathematical
knowledge. This is illustrated by Worked Exercise A52 below.


A fact that you may already know, which will be useful in Worked 
Exercise A52 and also later in this section, is that every integer greater 
than 1 has a unique expression as a product of prime numbers. For 
example, $6468 = 2 \times 2 \times 3 \times 7 \times 7 \times 11$, and this is the only way to express
6468 as a product of primes (except of course that we can change the order 
of the primes in the expression — the expression is unique up to the order 
of the primes). This fact is known as the Fundamental Theorem of 
Arithmetic. 


Theorem A13 Fundamental Theorem of Arithmetic 


Every integer greater than 1 can be written as a product of prime 
numbers. The factorisation is unique up to the order of the factors. 


It is certainly not obvious that the Fundamental Theorem of Arithmetic is 
true! However, a proof is beyond the scope of this module. 
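The factorisation quoted above can be reproduced with a short computation. The following is a minimal sketch using trial division; the function name `prime_factors` is an assumption made for illustration and is not part of the module text. It finds one factorisation and does not demonstrate uniqueness, which is what the theorem asserts.

```python
def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n (n > 1) in non-decreasing order."""
    if n <= 1:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:   # divide out each prime as often as possible
            factors.append(d)
            n //= d
        d += 1
    if n > 1:               # whatever remains is itself prime
        factors.append(n)
    return factors


print(prime_factors(6468))  # [2, 2, 3, 7, 7, 11]
```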


Worked Exercise A52 


Prove that for every integer $n$, the number $n^3 + 3n^2 + 2n$ is divisible by 6.


The next exercise gives you practice in the techniques that you have seen 
so far in this subsection. 




Exercise A110 


Prove each of the following implications. 


(a) If $n$ is an even integer, then $n^2$ is even.
(b) If $m$ and $n$ are multiples of $k$, then so is $m + n$.
(c) If one of the pair $m, n$ is odd and the other is even, then $m + n$ is odd.
(d) If $n$ is a positive integer, then $n^2 + n$ is even.


If a proof of an implication is particularly simple, and each statement in 
the sequence follows directly from the one immediately before, then we 
sometimes present the proof by writing the sequence of statements in the 
form 


\[ P \implies P_1 \implies P_2 \implies P_3 \implies \cdots \implies P_n \implies Q. \]


This notation indicates that each of the statements $P \implies P_1$, $P_1 \implies P_2$,
$\ldots$, $P_n \implies Q$ is true. It is particularly appropriate for proofs that
depend mostly on algebraic manipulation. Here is an example. 


Worked Exercise A53 


Prove that if $x(x - 2) = 3$, then $x = -1$ or $x = 3$.


Solution 
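\[
\begin{aligned}
x(x - 2) = 3 &\implies x^2 - 2x - 3 = 0 \\
&\implies (x + 1)(x - 3) = 0 \\
&\implies x + 1 = 0 \text{ or } x - 3 = 0 \\
&\implies x = -1 \text{ or } x = 3.
\end{aligned}
\]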


It is worth noting that Worked Exercise A53 does not ask us to solve the 
equation, but, rather, to prove the implication 


\[ x(x - 2) = 3 \implies x = -1 \text{ or } x = 3. \]


By proving this implication, we showed that $-1$ and $3$ are the only
possibilities for solutions of the equation $x(x - 2) = 3$. We did not show
that $-1$ and $3$ actually are solutions, since for that it is necessary to prove
also that if $x = -1$ or $x = 3$, then $x(x - 2) = 3$, that is, the converse of the
given implication. Thus, strictly, we have not solved the equation!
Whenever we solve an equation, an implication and its converse must both 
be proved; in other words, we need to prove an equivalence. We will do 
this for the equation in Worked Exercise A53 in the next subsection (see 
Worked Exercise A55). 


Even though proofs that depend on algebraic manipulation are among the 
easiest to produce, they still require care, as the following example shows. 


Worked Exercise A54 
Explain why the following proof that
\[ 4x^2 = x \implies x = \tfrac{1}{4} \]
is incorrect.


Claim (incorrect!)


If $4x^2 = x$, then $x = \tfrac{1}{4}$.


Proof (incorrect!)
\[ 4x^2 = x \implies 4x = 1 \implies x = \tfrac{1}{4}. \]

Worked Exercise A54 requires a skill that is also helpful in writing proofs, 
namely the ability to evaluate arguments critically. The next two exercises 
give you further practice at spotting mistakes in deductions. 


In the first of these exercises, and elsewhere later in the unit, you will meet 
an argument that involves rearranging an inequality. You should be 
familiar with the rules for rearranging inequalities from your previous 
mathematical studies, but if you need to refresh your memory, the rules 
are listed in the module Handbook. Inequalities are especially important in 
analysis, so you will study them more formally in the first of the analysis 
units, Unit D1 Numbers. 




Exercise A111 


Explain why the following deduction that $x = -1$ from the assumption
$x < -1$ is incorrect.


Claim (incorrect!)


If $x < -1$, then $x = -1$.


Proof (incorrect!) We have that $x < -1 \implies (x + 1)^2 < 0$,
because
\[ x < -1 \implies x + 1 < 0 \implies (x + 1)^2 < 0. \]

However, $(x + 1)^2$ is the square of a real number, and a square can
never be negative. Hence the only possibility is $x + 1 = 0$, that is,
$x = -1$.


Therefore, if $x < -1$, then $x = -1$.


In the next exercise you are asked to evaluate an incorrect argument that 
claims to prove a correct statement. 


Exercise A112 


Consider the following statement. 
If $z_1 = 1 + 2i$ and $z_2 = \sqrt{3} - i\sqrt{2}$, then $|z_1| = |z_2|$.


Explain why the argument below is not a correct proof of this statement 
and write a correct proof. 


Proof (incorrect!) 
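\[
\begin{aligned}
|z_1| = |z_2| &\implies |z_1|^2 = |z_2|^2 \\
&\implies 1^2 + 2^2 = (\sqrt{3})^2 + (-\sqrt{2})^2 \\
&\implies 1 + 4 = 3 + 2 \\
&\implies 5 = 5.
\end{aligned}
\]

Therefore $|z_1| = |z_2|$.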


The incorrect proof in Exercise A112 shows a common proof pitfall: it is 
important to remember that assuming the statement P to be proved and 
using it to deduce a statement that is known to be true provides no 
information at all about the truth of P. Here is an archetypal example of 
this kind of incorrect argument. 


Example (incorrect!) 


In this example the conclusion $1 = 1$ is true, and each step in the
deduction is valid, but the original statement, $1 = -1$, is most definitely
false! As you learned in Subsection 1.3, an implication $P \implies Q$ does not
give any information about the truth or falsity of $Q$ when $P$ is false.


2.2 Proving equivalences 


We now discuss how to prove equivalences. Recall that the equivalence 
‘$P$ if and only if $Q$’ asserts that both the implication ‘$P \implies Q$’ (‘$P$ only
if $Q$’) and its converse ‘$Q \implies P$’ (‘$P$ if $Q$’) are true. The best way to
prove ‘P if and only if Q’ is usually to tackle each implication separately. 
However, if a simple proof of one of the implications can be found, in 
which each statement follows from the one before, then it is sometimes 
possible to ‘reverse all the arrows’ to obtain a proof of the converse 
implication. That is, if you have found a proof of the form 


\[ P \implies P_1 \implies P_2 \implies P_3 \implies \cdots \implies P_n \implies Q, \]


then you may find that also each of the following implications is true: 


\[ Q \implies P_n \implies \cdots \implies P_3 \implies P_2 \implies P_1 \implies P. \]


In this case you may be able to present the proofs of both implications at 
once, by writing 


\[ P \iff P_1 \iff P_2 \iff P_3 \iff \cdots \iff P_n \iff Q. \]


As with implications, this is particularly appropriate for proofs that 
depend mostly on algebraic manipulation. The next worked exercise gives 
a proof of this type showing that the implication in Worked Exercise A53 
and its converse are both true. Remember that the symbol $\iff$ is the
one to use when solving equations or inequalities. 




Worked Exercise A55 


Prove that $x(x - 2) = 3$ if and only if $x = -1$ or $x = 3$.


Solution 
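\[
\begin{aligned}
x(x - 2) = 3 &\iff x^2 - 2x - 3 = 0 \\
&\iff (x + 1)(x - 3) = 0 \\
&\iff x + 1 = 0 \text{ or } x - 3 = 0 \\
&\iff x = -1 \text{ or } x = 3.
\end{aligned}
\]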


In Worked Exercise A55 we solved the equation $x(x - 2) = 3$: we showed
that its solution set is $\{-1, 3\}$. The forward ($\implies$) part of the proof shows
that if $x$ satisfies $x(x - 2) = 3$, then $x = -1$ or $x = 3$; in other words, these
are the only possible solutions of the equation. This is what we proved
earlier in Worked Exercise A53. The backward ($\impliedby$) part shows that if
$x = -1$ or $x = 3$, then $x$ satisfies $x(x - 2) = 3$; in other words, these two
values actually are solutions of the equation (note that if you were asked to
prove only that $x = -1$ and $x = 3$ are solutions, and not that they are the
only solutions, then it would be more natural to simply substitute each of
these values in turn into the equation).
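The substitution check mentioned in the parenthetical remark can be sketched in a few lines. The helper name `satisfies_equation` is hypothetical. The check confirms only that the two values are solutions; it says nothing about whether other solutions exist.

```python
def satisfies_equation(x):
    """Return True if x satisfies x(x - 2) = 3."""
    return x * (x - 2) == 3


# Substitute each candidate value in turn.
for candidate in (-1, 3):
    print(candidate, satisfies_equation(candidate))
```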


In the next worked exercise you are asked to prove a statement that 
involves sets. The proof requires a separate argument for each of the two 
implications that make up the equivalence. 


Worked Exercise A56 


Let A and B be any sets. Prove that 
\[ A \cup B = A \iff B \subseteq A. \]


Solution 


✎ We prove the $\implies$ direction first; that is, we assume that
$A \cup B = A$. ✎


Suppose $A \cup B = A$.


✎ We want to deduce that $B \subseteq A$, that is, that if $x \in B$ then $x \in A$.
So we pick an element $x \in B$. ✎


Let $x \in B$. Then $x$ is also in the union $A \cup B$. But since $A \cup B = A$,
this implies that $x \in A$. Therefore $B \subseteq A$, so we have shown that
$A \cup B = A \implies B \subseteq A$.


✎ Now we prove the $\impliedby$ direction. ✎


For the converse, assume that $B \subseteq A$.


✎ We want to show that $A \cup B = A$. The equality holds if both
$A \cup B \subseteq A$ and $A \subseteq A \cup B$. The inclusion $A \subseteq A \cup B$ follows
immediately from the definition of $A \cup B$. So we really want to show
that the condition $A \cup B \subseteq A$ holds, that is, that if $x \in A \cup B$ then
$x \in A$. ✎


Let $x \in A \cup B$. Then $x \in A$ or $x \in B$. If $x \in B$, then $x \in A$, because
$B \subseteq A$ by assumption. Therefore $x \in A$ in either case, and so $A \cup B \subseteq A$.


Since $A \subseteq A \cup B$ always holds, it follows that $A \cup B = A$, so we have
shown that $B \subseteq A \implies A \cup B = A$.


✎ Finally, we can state our conclusion. ✎


Hence $A \cup B = A \iff B \subseteq A$.
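As an illustration only, the equivalence can be checked exhaustively for every pair of subsets of a small set. The sketch below assumes the universe {1, 2, 3} and uses hypothetical helper names (`subsets`, `equivalence_holds`). A finite check like this is evidence, not a substitute for the proof, which covers all sets.

```python
from itertools import combinations


def subsets(universe):
    """Return every subset of the given finite collection, as frozensets."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def equivalence_holds(a, b):
    """Check that 'A union B = A' and 'B subset of A' have the same truth value."""
    return ((a | b) == a) == (b <= a)


all_subsets = subsets({1, 2, 3})
print(all(equivalence_holds(a, b) for a in all_subsets for b in all_subsets))
```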


Exercise A113 


Prove the following equivalences. 
(a) $n$ is even $\iff$ $n + 8$ is even.
(b) $A \subseteq A \cap B \iff A \subseteq B$.


Remember from Subsection 1.4 that an alternative way to express the
equivalence $P \iff Q$ is to assert


$P \implies Q$ and (not $P \implies$ not $Q$).


Thus the ‘if’ part of the equivalence, that is, $Q \implies P$, can be proved
by an argument that shows ‘not $P \implies$ not $Q$’. This is sometimes
convenient, as the next worked exercise shows. 


Worked Exercise A57 


Let n be a positive integer. Prove that 


\[ n \text{ is even} \iff n^3 \text{ is even}. \]


Solution 


✎ We start by proving the $\implies$ direction; that is, we assume that $n$
is even and we want to deduce that $n^3$ is even. ✎


Let n be an even integer. Then 
n = 2k for some integer k. 


Hence
\[ n^3 = (2k)^3 = 8k^3 = 2(4k^3). \]




Since $4k^3$ is an integer, this shows that $n^3$ is an even integer. Thus we
have shown that if $n$ is even, then $n^3$ is even.


✎ We now need to prove the $\impliedby$ direction; that is, we want to
deduce that $n$ is even from the assumption that $n^3$ is even. Writing
$n^3 = 2k$ for some integer $k$ does not seem to help in any obvious way
to establish that $n$ is even. Thus we try another approach: we assume
that $n$ is not even, that is, $n$ is odd, and deduce that $n^3$ is odd.
That is, we prove the contrapositive of the implication ‘if $n^3$ is even,
then $n$ is even’, which is equivalent to the implication. ✎


Now assume that n is odd. Then 
n = 2k + 1 for some integer k. 
Hence 
\[ n^3 = (2k + 1)^3 = 8k^3 + 12k^2 + 6k + 1 = 2(4k^3 + 6k^2 + 3k) + 1. \]
Since $4k^3 + 6k^2 + 3k$ is an integer, this shows that $n^3$ is an odd integer.


Thus we have shown that if $n$ is odd, then $n^3$ is odd, which is
equivalent to the statement that $n$ is even whenever $n^3$ is even.


Hence $n$ is even $\iff$ $n^3$ is even, as required.


In the solution to Worked Exercise A57, the proof of the $\impliedby$ direction is
an example of proof by contraposition, a method that you will look at in 
detail in Subsection 3.2. 


If you decide to prove an equivalence by using a sequence of $\iff$ symbols, as in
Worked Exercise A55, you must be sure that its use is valid at each step; 
in other words, that both implications hold. This advice holds in general 
for proofs where it may be tempting to use a sequence of equivalences, 
rather than to look at each implication separately. The next worked 
exercise shows an example of a rash use of a sequence of equivalences. 


Worked Exercise A58 


Consider the following exercise. 


Let $n$ be a positive integer. Prove that
$n$ is a multiple of 5 if and only if $n^2$ is a multiple of 5.


Explain why the proof below is incomplete.


Proof (incorrect!)
\[
\begin{aligned}
n \text{ is a multiple of } 5 &\iff n = 5k \text{ for some integer } k \\
&\iff n^2 = 25k^2 \text{ for some integer } k \\
&\iff n^2 = 5(5k^2) \text{ for some integer } k \\
&\iff n^2 \text{ is a multiple of } 5.
\end{aligned}
\]


In the solution to Worked Exercise A58, separating the two implications to 
be proved would have helped avoid the issue with the incorrect proof. In 
Worked Exercise A70 in Section 3.2 you will see a proof of the implication 


if $n^2$ is a multiple of 5, then $n$ is a multiple of 5


by contraposition, the method that we also used to prove the $\impliedby$
direction in Worked Exercise A57. This implication can also be proved 
directly by using the Fundamental Theorem of Arithmetic and the fact 
that if a prime number p divides a product ab, then p divides a or p 
divides b. 


Many theorems whose statement contains an equivalence have the form 
If P, then (Q if and only if R). 

or, equivalently, 
Suppose P. Then (Q if and only if R). 


In these cases, the assumption P holds throughout the proof. In addition, 
you assume Q when proving the implication ‘if Q, then R’, and you 
assume R when you prove the converse implication ‘if R, then Q’. The 
Factor Theorem (Theorem A2 in Unit A2) has this form, and you already 
know enough to work through its proof. 




This proof is longer and it may require more work to understand than the 
examples you have seen so far. However, it is a good example of how the 
ideas in this subsection, and in previous ones, appear in mathematical 
practice. If you get stuck with the details of the deductions, try to 
concentrate on the structure of the proof and come back to it when you 
have had more practice. 


Theorem A2 Factor Theorem (in $\mathbb{R}$)


Let $p(x)$ be a real polynomial, and let $\alpha \in \mathbb{R}$. Then $p(\alpha) = 0$ if and
only if $x - \alpha$ is a factor of $p(x)$.


Proof ✎ Throughout the proof, we assume that $p(x)$ is a real
polynomial and that $\alpha \in \mathbb{R}$. Under these assumptions, we need to prove an
equivalence. We tackle each implication separately, and we start with the
‘if’ direction. So we start by assuming that $x - \alpha$ is a factor of $p(x)$. ✎


Assume first that $x - \alpha$ is a factor of $p(x)$, that is, assume that
\[ p(x) = (x - \alpha)\, q(x) \]
for some real polynomial $q(x)$ whose degree is lower than the degree
of $p(x)$. Then
\[ p(\alpha) = (\alpha - \alpha)\, q(\alpha) = 0, \]
as required.

✎ We now prove the ‘only if’ direction, so we assume that $p(\alpha) = 0$. ✎


Now assume that $p(\alpha) = 0$, and let
\[ p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \]
where $a_n \neq 0$. Since $p(\alpha) = 0$, we have
\[
\begin{aligned}
p(x) &= p(x) - p(\alpha) \\
&= (a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0) \\
&\quad - (a_n \alpha^n + a_{n-1} \alpha^{n-1} + \cdots + a_1 \alpha + a_0) \\
&= a_n (x^n - \alpha^n) + a_{n-1} (x^{n-1} - \alpha^{n-1}) + \cdots + a_1 (x - \alpha),
\end{aligned}
\]
since the constant terms $a_0$ cancel.

Now, by Theorem A12 (the Geometric Series Identity), we know that
$x - \alpha$ is a factor of each of the bracketed terms in this last expression, and
so it is a factor of $p(x)$, as required.


This concludes our proof. ∎
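The statement of the Factor Theorem can be illustrated numerically. The sketch below does not follow the proof above (which uses the Geometric Series Identity); instead it divides by a linear factor using Horner's scheme (synthetic division), a different technique swapped in for illustration. The function name `divide_by_linear` and the coefficient-list representation are assumptions. The remainder it returns equals the value of the polynomial at the chosen point, so the remainder is zero exactly when that point is a root.

```python
def divide_by_linear(coeffs, alpha):
    """Divide p(x) by (x - alpha) using Horner's scheme.

    coeffs lists the coefficients of p(x) from the highest power down.
    Returns (quotient_coeffs, remainder); the remainder equals p(alpha).
    """
    values = []
    acc = 0
    for c in coeffs:
        acc = acc * alpha + c   # running Horner evaluation
        values.append(acc)
    return values[:-1], values[-1]


p = [1, -2, -3]                 # p(x) = x^2 - 2x - 3
print(divide_by_linear(p, 3))   # ([1, 1], 0): p(x) = (x - 3)(x + 1)
print(divide_by_linear(p, 1))   # ([1, -1], -4): remainder p(1) = -4
```

Here 3 is a root of the example polynomial, so the remainder is 0 and the quotient gives the other factor; 1 is not a root, and the nonzero remainder shows that x − 1 is not a factor.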


So far we have discussed proof only in the context of implications (and 
equivalences — though an equivalence is just two implications). Much of 
what we have said extends to proofs of other types of statements. A 
statement Q that is not an implication can be proved by building up a 
sequence of statements leading to Q in the way that we have seen for an 


implication, except that there is no assumption P to be made at the start. 
Instead, the first statement in the sequence must be one that we know to 
be true from our previous mathematical knowledge. 


In the next subsection we apply what you have learned so far to proving
existential and universal statements. 


2.3 Proving existential and universal 
statements 


Statement 4 at the start of Subsection 1.1 is an example of a statement 
that is not an implication, nor an equivalence — it is an existential 
statement: 


There is a real number $x$ such that $\cos x = x$.


Existential statements can sometimes be proved by finding an object that 
satisfies the property in the statement. 


Worked Exercise A59 


Prove that there is a positive real number $x$ such that $x < \sqrt{x}$.


Here is another example for you to try. 


Exercise A114 


Prove that there is an integer $n$ such that $3^n > 9^n$.




However, constructing a mathematical object with a given property can be 
considerably harder than this, and even impossible: for example, we have 
no way to find an exact solution to the equation 


\[ \cos x = x, \]


and so we need an alternative way to prove Statement 4. We can note that 
the graphs of the functions f(x) = cosx and g(x) = x intersect at least 
once, so the equation does have a solution. However, for a rigorous proof 
we need the Intermediate Value Theorem, which is proved later in the
module. In cases where an example is hard to find, other methods of proof 
should be tried. On the other hand, when an existential statement can be 
proved by explicitly describing a mathematical object, it is important to 
remember that one example suffices: it is bad style to give multiple 
examples. 


To prove a universal statement about an infinite set, however, you always 
need to give a general argument. You have already seen many examples of 
this in this unit, since many of the statements you have met so far are 
universal, though the universal quantifier is often implicit. For example, 
the statement in Worked Exercise A50, 


if $n$ is odd, then $n^2$ is odd,
could be rephrased as
for all integers $n$, if $n$ is odd, then $n^2$ is odd,


whilst the statement in Worked Exercise A52 contains an explicit universal 
quantifier: 


for every integer $n$, the number $n^3 + 3n^2 + 2n$ is divisible by 6.


The proof of a universal statement is an argument, of the kind you have 
seen in previous examples in this unit, that applies to all the objects 
covered by the quantifier. It is important to remember that checking that 
the statement holds in particular instances, however many, does not 
constitute a proof of a universal statement about an infinite set. 
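To make the distinction concrete, here is a minimal sketch that checks the statement from Worked Exercise A52 for a finite range of integers. The function name `divisible_by_six` and the range −1000 to 1000 are arbitrary choices for illustration. However many cases pass, the check only supports the statement; a proof needs a general argument.

```python
def divisible_by_six(n):
    """Return True if n^3 + 3n^2 + 2n is divisible by 6."""
    return (n**3 + 3 * n**2 + 2 * n) % 6 == 0


# Checking 2001 particular instances: evidence, but not a proof.
print(all(divisible_by_six(n) for n in range(-1000, 1001)))
```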


2.4 Counterexamples 


Proving that a statement is true can be difficult. However, you may 
suspect that a statement is false, and it can often (but not always!) be 
easier to deal with this situation, especially when the statement is 
universal. 


For example, recall that statements of the form 
\[ P(x) \implies Q(x) \]

are in fact universal statements where the universal quantifier ‘for all $x$’ is
omitted by convention. So the negation of $P(x) \implies Q(x)$ is


There is $x$ such that $P(x)$ and not $Q(x)$.


Thus to prove that $P(x) \implies Q(x)$ is false, you just have to give one
example of a case where the statement P(x) is true but the statement 
Q(x) is false. Such an example is called a counterexample to the 
implication. Here are two examples. 


Worked Exercise A60 


Show that each of the following implications about integers is false, by 
giving counterexamples. 


(a) If the product mn is a multiple of 4, then both m and n are multiples 
of 2. 


(b) If $n$ is prime, then $2^n - 1$ is prime.


Solution 


(a) Taking m = 4 and n = 1 provides a counterexample because then 
mn = 4, which is a multiple of 4, but n is not a multiple of 2. 
Hence the implication is false. 


(b) The number 11 is a counterexample because 11 is prime but 
$2^{11} - 1 = 2047$, which is not prime since $2047 = 23 \times 89$. Hence
the implication is false. 


Remember that just one counterexample is sufficient. For example, you 
can show that the statement 


if $x^2 > 4$, then $x > 2$


is false by considering the value $x = -3$. There is no need to show that
every number $x$ less than $-2$ is a counterexample, even though this is the
case. 


There is no general method for finding counterexamples. For some 
statements, such as the statement in Worked Exercise A60(a), a little 
thought about the statement should suggest a suitable counterexample. 
For other statements, the quickest method may just be to try out different 
values for the variable (or variables) until you find a counterexample. For 
example, for the statement in Worked Exercise A60(b) we can repeatedly 
choose a prime number $n$, calculate $2^n - 1$ and check whether it is prime.


In order to carry out this procedure, we need a method for checking 
whether a given number m is prime. We could simply check whether m is 
divisible by each of the integers between 2 and m — 1 inclusive, but this 
involves a large amount of calculation even for fairly small integers m. We 
can significantly reduce the amount of calculation needed by using the 
following fact, which holds for any integer $m \geq 2$:


If $m$ is not divisible by any of the primes less than or equal to $\sqrt{m}$,
then $m$ is a prime number.




You will be asked to prove this statement in Subsection 3.2. Here is an 
example of its use. 


Worked Exercise A61 


Show that 127 is a prime number. 


Solution 


We have $\sqrt{127} = 11.3$ to one decimal place, so the primes less than or
equal to $\sqrt{127}$ are 2, 3, 5, 7 and 11. Dividing 127 by each of these in
turn gives a non-integer answer in each case, so 127 is prime. 
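The search procedure described before this worked exercise can be sketched as follows. This is a simplified variant, not the exact method in the text: for brevity it divides by every integer from 2 up to the integer part of the square root, rather than by primes only, which performs some redundant divisions but gives the same answer. The names `is_prime` and `first_mersenne_counterexample` are hypothetical.

```python
from math import isqrt


def is_prime(m):
    """Trial division by the integers 2, ..., floor(sqrt(m))."""
    if m < 2:
        return False
    return all(m % d != 0 for d in range(2, isqrt(m) + 1))


def first_mersenne_counterexample():
    """Smallest prime n for which 2**n - 1 is not prime."""
    n = 2
    while True:
        if is_prime(n) and not is_prime(2**n - 1):
            return n
        n += 1


print(is_prime(127))                     # True
print(first_mersenne_counterexample())   # 11
```

The search tries n = 2, 3, 5, 7 (giving the primes 3, 7, 31, 127) before reaching n = 11, the counterexample used in Worked Exercise A60(b).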


Exercise A115 


Give a counterexample to disprove each of the following implications. 
(a) If $m + n$ is even, then both $m$ and $n$ are even.
(b) If $x < 2$, then $(x^2 - 2)^2 < 4$.
(c) If $n$ is a positive integer, then $4^n + 1$ is prime.


As with implications, you may sometimes suspect that an equivalence is 
false. To prove that an equivalence P <=> Q is false, you have to show 
that at least one of the implications P => Q and Q => P is false, 
which you can do by providing a counterexample; that is, you need a case 
where one of P or Q is true, and the other is false. 


Exercise A116 


Show that the equivalence 
\[ x^2 = 9 \iff x = 3 \]


is false. 


2.5 Proof by induction 


Mathematical induction is a powerful method of proof that is particularly 
useful for proving statements involving integers, but also has wider 
applications. 


The great French mathematician Henri Poincaré (1854-1912) 
described proof by mathematical induction as ‘mathematical 
reasoning par excellence’. 


Consider, for example, Statement 3 in our list at the beginning of 
Subsection 1.1: 


\[ 1 + 3 + 5 + \cdots + (2n - 1) = n^2 \quad \text{for each positive integer } n. \]
Let us denote the variable proposition
\[ 1 + 3 + 5 + \cdots + (2n - 1) = n^2 \]
by $P(n)$. It is easy to check that $P(n)$ is true for small values of $n$; for
example
\[ 1 = 1^2, \]
\[ 1 + 3 = 4 = 2^2, \]
\[ 1 + 3 + 5 = 9 = 3^2, \]


so certainly P(1), P(2) and P(3) are all true. But how can we prove that 
P(n) is true for all positive integers n? 


The method of induction works like this. Suppose that we wish to prove 
that a statement P(n), such as the one above, is true for all positive 
integers n. Now suppose that we have proved that the following two 
statements are true. 


1. $P(1)$.
2. If $P(k)$ is true, then so is $P(k + 1)$, for $k = 1, 2, \ldots$.


Let us consider what we can deduce from this. Certainly P(1) is true, 
because that is statement 1. Also P(2) is true because, by statement 2, if 
P(1) is true, then so is P(2). Similarly, by statement 2, P(3) is true since 
P(2) is. Since this process goes on for ever, we can deduce that P(n) is 
true for all positive integers n. We thus have the following method. 


Principle of Mathematical Induction 
To prove that a statement P(n) is true for n = 1,2,...: 
1. show that P(1) is true 


2. show that the implication $P(k) \implies P(k + 1)$ is true
   for $k = 1, 2, \ldots$.


Mathematical induction is often compared to pushing over a line of 
dominoes — this is illustrated in Figure 2. Imagine a (possibly infinite!) line 
of dominoes set up in such a way that if any one domino falls then the 
next domino in line will fall too — this is analogous to step 2 above. Now 
imagine pushing over the first domino — this is analogous to step 1. The 
result is that all the dominoes fall! 




Figure 2 Toppling dominoes 




Although indications of the method of mathematical induction can be 
found earlier, the first satisfactory formulations of the method are due 
to Pierre de Fermat (1601?-1665) in his work on number theory of 
1630 (although not published until 1670) and Blaise Pascal 
(1623-1662) in a book on arithmetical triangles of 1654. 


The term mathematical induction was introduced by the British 
mathematician Augustus De Morgan (1806-1871) in 1838 in an article 
he wrote for the Penny Cyclopedia. 


In the next worked exercise we apply mathematical induction to prove the 
statement mentioned at the beginning of this subsection. 


Augustus De Morgan 


Worked Exercise A62 


Prove that 
\[ 1 + 3 + \cdots + (2n - 1) = n^2, \quad \text{for } n = 1, 2, \ldots. \]


Solution 

✎ Write out $P(n)$. ✎

Let $P(n)$ be the statement $1 + 3 + \cdots + (2n - 1) = n^2$.

✎ Next, carry out step 1, that is, check that $P(1)$ holds. ✎

$P(1)$ is true because $1 = 1^2$.

✎ Now proceed with step 2. State the assumption, $P(k)$. ✎

Now let $k \geq 1$, and assume that $P(k)$ is true; that is,
\[ 1 + 3 + \cdots + (2k - 1) = k^2. \]

✎ State the desired conclusion, $P(k + 1)$. The final term on the
left-hand side of $P(k + 1)$ is $2(k + 1) - 1 = 2k + 1$. ✎

We wish to deduce that $P(k + 1)$ is true; that is,
\[ 1 + 3 + \cdots + (2k - 1) + (2k + 1) = (k + 1)^2. \]

✎ Now prove that $P(k) \implies P(k + 1)$. It should help to start with
the left-hand side of the equality in $P(k + 1)$ and rearrange it in such
a way that $P(k)$ can be used. ✎

Now
\[
\begin{aligned}
1 + 3 + \cdots + (2k - 1) + (2k + 1) &= (1 + 3 + \cdots + (2k - 1)) + (2k + 1) \\
&= k^2 + (2k + 1) \quad \text{(by } P(k)\text{)} \\
&= (k + 1)^2.
\end{aligned}
\]

✎ This proves that $P(k) \implies P(k + 1)$, so we write out our
conclusions. ✎

Thus we have shown that
\[ P(k) \implies P(k + 1), \quad \text{for } k = 1, 2, \ldots. \]


Hence, by mathematical induction, P(n) is true, for n = 1,2,.... 
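The two ingredients of this induction proof can be spot-checked numerically. The following sketch checks the base case, the algebraic identity used in the inductive step, and the final formula for a finite range of values; the function name `sum_of_first_odds` and the ranges are arbitrary illustrative choices. These checks cover only finitely many cases, so they complement the proof rather than replace it.

```python
def sum_of_first_odds(n):
    """Return 1 + 3 + ... + (2n - 1)."""
    return sum(2 * i - 1 for i in range(1, n + 1))


# Step 1: the base case P(1).
print(sum_of_first_odds(1) == 1**2)

# Step 2: the algebra k^2 + (2k + 1) = (k + 1)^2, for many values of k.
print(all(k**2 + (2 * k + 1) == (k + 1)**2 for k in range(1, 1001)))

# The conclusion P(n), for n = 1, ..., 200.
print(all(sum_of_first_odds(n) == n**2 for n in range(1, 201)))
```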


Exercise A117 


Prove each of the following statements by mathematical induction. 
(a) $1 + 2 + \cdots + n = \tfrac{1}{2} n(n + 1)$, for $n = 1, 2, \ldots$.
(b) $1^3 + 2^3 + \cdots + n^3 = \tfrac{1}{4} n^2 (n + 1)^2$, for $n = 1, 2, \ldots$.


In the next worked exercise the argument used to prove P(k + 1) from 
P(k) involves a more sophisticated algebraic manipulation than in Worked 
Exercise A62 and Exercise A117. 


Worked Exercise A63 


Prove that $2^{3n+1} + 5$ is a multiple of 7, for $n = 1, 2, \ldots$.


Solution 
Let $P(n)$ be the statement


$2^{3n+1} + 5$ is a multiple of 7.


$P(1)$ is true because $2^{3 \times 1 + 1} + 5 = 2^4 + 5 = 21 = 3 \times 7$.
Now let $k \geq 1$, and assume that $P(k)$ is true; that is,


$2^{3k+1} + 5$ is a multiple of 7.


We wish to deduce that $P(k + 1)$ is true; that is,


$2^{3(k+1)+1} + 5 = 2^{3k+4} + 5$ is a multiple of 7.


✎ We need an algebraic manipulation that creates the subexpression
$2^{3k+1}$ on the right-hand side, so that we can use $P(k)$. Now, the
exponent of 2 in $P(k + 1)$ is $3k + 4$. Note that $3k + 4 = 3 + (3k + 1)$,
and $3k + 1$ is the exponent of 2 in $P(k)$. ✎
Now
\[
\begin{aligned}
2^{3k+4} + 5 &= 2^3 \times 2^{3k+1} + 5 \\
&= 7 \times 2^{3k+1} + 2^{3k+1} + 5.
\end{aligned}
\]




The first term here is a multiple of 7, and $2^{3k+1} + 5$ is a multiple of 7,
by $P(k)$. Therefore $2^{3k+4} + 5$ is a multiple of 7. Thus we have shown
that
\[ P(k) \implies P(k + 1), \quad \text{for } k = 1, 2, \ldots. \]


Hence, by mathematical induction, P(n) is true, for n = 1,2,.... 


Mathematical induction can be adapted to deal with situations that differ 
a little from the standard one. For example, if a statement P(n) is not 
true for n = 1 but we wish to prove that it is true for n = 2,3,..., then we 
can do this by following the usual method, except that in step 1 we prove 
that P(2), rather than P(1), is true. This is analogous to pushing over the 
second domino in the line: the result is that all the dominoes except the 
first fall! 


Also, in step 2 we have to show that $P(k) \implies P(k + 1)$ for $k = 2, 3, \ldots$,
rather than for $k = 1, 2, \ldots$. In the next worked exercise we prove that a
statement is true for $n = 7, 8, \ldots$.


Worked Exercise A64 


Prove that $3^n < n!$ for all $n \geq 7$.


Solution 
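Let $P(n)$ be the statement ‘$3^n < n!$’.


✎ We are told to prove the statement for all $n \geq 7$, so we consider
$P(7)$. ✎


$P(7)$ is true because $3^7 = 2187 < 5040 = 7!$.
Now let $k \geq 7$, and assume that $P(k)$ is true; that is,
\[ 3^k < k!. \]
We wish to deduce that $P(k + 1)$ is true; that is,
\[ 3^{k+1} < (k + 1)!. \]
Now
\[
\begin{aligned}
3^{k+1} &= 3 \times 3^k \\
&< 3 \times k! \quad \text{(by } P(k)\text{)} \\
&< (k + 1)\,k! \quad \text{(because } k \geq 7 \text{, and hence } k + 1 \geq 8 > 3\text{)} \\
&= (k + 1)!.
\end{aligned}
\]
✎ The conclusion $P(k) \implies P(k + 1)$ holds for all $k \geq 7$. ✎
Hence $P(k) \implies P(k + 1)$, for $k = 7, 8, \ldots$.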


Hence, by mathematical induction, P(n) is true, for n = 7,8,.... 


P(n) happens to be false for n = 1,2,...,6 in Worked Exercise A64 (you 
can check this if you like). However, the proof does not require any 
mention of this fact. 
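The check suggested in the remark above can be carried out with a short script. This is a minimal sketch; the function name `p` mirrors the statement label but is otherwise an arbitrary choice, and the upper limit 12 is chosen only for illustration.

```python
from math import factorial


def p(n):
    """The statement P(n): 3**n < n!."""
    return 3**n < factorial(n)


# List the values of n from 1 to 12 for which P(n) is false.
print([n for n in range(1, 13) if not p(n)])   # [1, 2, 3, 4, 5, 6]
```

The case n = 6 fails narrowly, since 729 is not less than 720, which is why the induction has to start at 7.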


Exercise A118 


Prove each of the following statements by mathematical induction. 
(a) $4^{2n-3} + 1$ is a multiple of 5, for $n = 2, 3, \ldots$.
(b) $5^n < n!$ for all $n \geq 12$.


Proof by induction is also useful in many cases where the statement to be 
proved does not concern a property of the integers. You have already met 
at least one theorem that can be proved in this way: Theorem A3 in 
Unit A2 concerns all real polynomials that have as many distinct roots as 
their degree. The general statement can be proved by showing that it 
holds for all real polynomials of degree n with n distinct roots, for each 

$n \in \mathbb{N}$. Below you will see that the proof applies the Principle of
Mathematical Induction to the degree n. 


This proof, rather like the proof of the Factor Theorem in Subsection 2.2, 
is more advanced than the induction proofs you have seen so far in this 
subsection. Similar advice applies here as for the proof of the Factor 
Theorem: if the details of the deductions are not clear to you, try to 
concentrate on the structure of the proof, in particular on how the 
Principle of Mathematical Induction is used, and if necessary come back to 
the proof at a later stage. 


Theorem A3 
Let $p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ be a real polynomial,
and suppose that $p(x)$ has $n$ distinct real roots $\alpha_1, \alpha_2, \ldots, \alpha_n$. Then
\[ p(x) = a_n (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n). \]


Proof ✎ If we show that the result holds for all polynomials of degree $n$
with $n$ distinct roots, for $n \in \mathbb{N}$, then we have proved the general
statement in the theorem. ✎


We argue by induction on the degree $n$. Let $P(n)$ be the statement


If $p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ is a real polynomial with
distinct roots $\alpha_1, \alpha_2, \ldots, \alpha_n$, then
\[ p(x) = a_n (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n). \]

✎ Step 1 is to show that the statement holds for all polynomials of
degree 1 with one real root; that is, we want to show that all polynomials
of the form $a_1 x + a_0$ can be written in the form $a_1 (x - \alpha_1)$, where $\alpha_1$ is a
root. ✎




$P(1)$ is true since if $p(x) = a_1 x + a_0$ (where $a_1 \neq 0$) is a real polynomial
with root $\alpha_1$, then $p(\alpha_1) = 0$. So
\[ a_1 \alpha_1 + a_0 = 0, \]
and so $a_0 = -a_1 \alpha_1$. Thus
\[ p(x) = a_1 x - a_1 \alpha_1 = a_1 (x - \alpha_1), \]
as required.


✎ In order to carry out step 2, we assume that the theorem holds for all
polynomials of degree $k$ with $k$ distinct real roots, and we want to deduce
that it holds for all polynomials of degree $k + 1$. ✎


Suppose that $P(k)$ is true; that is, suppose that all polynomials of degree $k$
with $k$ distinct real roots have a factorisation of the form
\[ a_k (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_k), \]
where $\alpha_1, \ldots, \alpha_k$ are the roots.
We wish to deduce that $P(k + 1)$ holds; that is, all polynomials of degree
$k + 1$ with $k + 1$ distinct real roots have a factorisation of the form
\[ a_{k+1} (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_{k+1}), \]
where $\alpha_1, \ldots, \alpha_{k+1}$ are the roots.


So let 
\[ q(x) = a_{k+1} x^{k+1} + a_k x^k + \cdots + a_1 x + a_0, \]
where $a_{k+1} \neq 0$, be a polynomial of degree $k + 1$ with $k + 1$ distinct real
roots. Let $\alpha_{k+1}$ be a root of $q(x)$. By the Factor Theorem, we have that
$x - \alpha_{k+1}$ is a factor of $q(x)$, so
\[ q(x) = (x - \alpha_{k+1})\, r(x), \]
where $r(x)$ is a polynomial of degree $k$. Moreover, the coefficient of $x^k$ in
$r(x)$ must be $a_{k+1}$.


✎ In order to apply $P(k)$ to $r(x)$, we also need to show that $r(x)$ has $k$
distinct roots. ✎


Now let $\alpha$ be a root of $q(x)$ other than $\alpha_{k+1}$. Then $q(\alpha) = 0$, that is,
\[ (\alpha - \alpha_{k+1})\, r(\alpha) = 0. \]
Since $\alpha \neq \alpha_{k+1}$, we have $\alpha - \alpha_{k+1} \neq 0$, and so we must have $r(\alpha) = 0$.
Thus $\alpha$ is a root of $r(x)$. Since $q(x)$ has $k + 1$ distinct real roots, including
$\alpha_{k+1}$, it follows that $r(x)$ has $k$ distinct real roots.


✎ Since $r(x)$ is a polynomial of degree $k$ with $k$ distinct roots, we can
apply $P(k)$. ✎


By $P(k)$, the polynomial $r(x)$ has a factorisation
\[ r(x) = a_{k+1} (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_k). \]
Thus
\[
\begin{aligned}
q(x) &= (x - \alpha_{k+1})\, a_{k+1} (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_k) \\
&= a_{k+1} (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_k)(x - \alpha_{k+1}).
\end{aligned}
\]
Therefore $q(x)$ has a factorisation of the required form. Thus we have
shown that $P(k) \implies P(k + 1)$, for $k = 1, 2, \ldots$. Hence, by mathematical
induction, $P(n)$ is true, for $n = 1, 2, \ldots$. ∎
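The factorisation in Theorem A3 can be illustrated in the reverse direction: starting from a leading coefficient and a list of roots, multiply out the linear factors and compare with the coefficients of the polynomial. The sketch below assumes a hypothetical helper `expand_from_roots` and the example polynomial 2x³ − 12x² + 22x − 12, whose roots are 1, 2 and 3; neither comes from the module text.

```python
def expand_from_roots(leading, roots):
    """Coefficients (highest power first) of leading * prod(x - r) over roots."""
    coeffs = [leading]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        shifted = coeffs + [0]                    # current polynomial times x
        scaled = [0] + [-r * c for c in coeffs]   # current polynomial times (-r)
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs


# 2(x - 1)(x - 2)(x - 3) should expand to 2x^3 - 12x^2 + 22x - 12.
print(expand_from_roots(2, [1, 2, 3]))   # [2, -12, 22, -12]
```

Each pass through the loop adds one linear factor, which loosely mirrors the inductive step of the proof, where a factorisation of a degree-k polynomial is extended to degree k + 1.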


The next exercise asks you to use induction to give a rigorous proof of the 
powers property of congruences that appears in Theorem A10 in Unit A2 
(an informal proof was given in Unit A2). Recall that two integers a and b 
are congruent modulo $n$, written $a \equiv b \pmod{n}$, if $a - b$ is a multiple of $n$.
In the proof, you will need to use the multiplication property of 
congruences, which was also part of Theorem A10: 


If $a \equiv b \pmod{n}$ and $c \equiv d \pmod{n}$, then $ac \equiv bd \pmod{n}$.


Exercise A119 


Let a, b and n be integers. Use the Principle of Mathematical Induction 
and the multiplication property of congruences to prove that 


if $a \equiv b \pmod{n}$, then $a^m \equiv b^m \pmod{n}$,
for $m = 1, 2, \ldots$.


Hint: Since you need to prove this statement for m = 1,2,..., call the 
statement P(m) and use induction on m. 
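Before attempting the proof, it may help to see the powers property hold numerically. The sketch below is only a spot check for particular values (it proves nothing in general); the function names `congruent` and `powers_property_holds` are illustrative assumptions, not notation from the module.

```python
def congruent(a, b, n):
    """True if a - b is a multiple of n (n must be non-zero)."""
    return (a - b) % n == 0


def powers_property_holds(a, b, n, max_power=10):
    """Check a^m ≡ b^m (mod n) for m = 1, ..., max_power, given a ≡ b (mod n)."""
    if not congruent(a, b, n):
        raise ValueError("a and b must be congruent modulo n")
    return all(congruent(a**m, b**m, n) for m in range(1, max_power + 1))


print(powers_property_holds(17, 3, 7))    # 17 ≡ 3 (mod 7)
print(powers_property_holds(-4, 11, 5))   # -4 ≡ 11 (mod 5)
```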


Finally in this section, here is some advice to consolidate what you have 
learned about induction proofs. When you write a proof by induction 
make sure that you clearly identify the statement to be proved, P(n), and 
structure your proof as follows: 


• prove that $P(1)$ holds (or $P(n_0)$ for some initial $n_0 \neq 1$)

• write down $P(k)$ and assume that it holds for a general $k$

• state that we need to deduce $P(k + 1)$, and write down $P(k + 1)$

• deduce $P(k + 1)$ from $P(k)$

• conclude that $P(n)$ holds for all natural numbers $n$ (or for all $n \ge n_0$
where appropriate).


If you are unsure about your proof, review it and check that it follows this
structure; in particular, check that $P(1)$ (or $P(n_0)$ where appropriate) is
proved correctly, and that you have used $P(k)$ in the proof of $P(k + 1)$.


Not all the proofs by induction that you will meet in the module materials, 
or in other textbooks, will match this format exactly, but this advice 
should help you write your own induction proofs. Below is an example of 
what can go wrong if you do not follow this template. 


2 Direct proof 




Unit A3 Mathematical language and proof 


Worked Exercise A65 
Consider the statement 
$2^1 + 2^2 + 2^3 + \cdots + 2^n = 2^{n+1} - 2$, for $n = 1, 2, \ldots$.


Explain why the proof below is not a correct proof by induction, and write 
a correct proof. 


Proof (incorrect!) Let $P(n)$ be the statement


$2^1 + 2^2 + 2^3 + \cdots + 2^n = 2^{n+1} - 2.$
$P(1)$ is true because $2^1 = 2^{1+1} - 2 = 2^2 - 2$.
Now let $k \ge 1$. Assume $P(k)$; that is, assume that
$2^1 + 2^2 + 2^3 + \cdots + 2^k = 2^{k+1} - 2.$
We wish to deduce that $P(k + 1)$ is true; that is
$2^1 + 2^2 + 2^3 + \cdots + 2^{k+1} = 2^{k+2} - 2.$
Dividing both sides of $P(k + 1)$ by 2 gives
$1 + 2 + 2^2 + \cdots + 2^k = 2^{k+1} - 1,$
hence
$2 + 2^2 + \cdots + 2^k = 2^{k+1} - 2$ by rearranging.


Since we have obtained $P(k)$, which we assume is true, we know that
$P(k + 1)$ is true.


Therefore $P(n)$ is true for $n = 1, 2, \ldots$ by mathematical induction. ■
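Note that the statement being proved is true even though the argument above is flawed. As a quick sanity check (a sketch only, and certainly not a substitute for a correct induction proof), the formula can be tested for the first few values of $n$; the function name `sum_of_powers_of_two` is a hypothetical helper.

```python
def sum_of_powers_of_two(n):
    """Left-hand side of P(n): 2^1 + 2^2 + ... + 2^n."""
    return sum(2**j for j in range(1, n + 1))


# Values of n in 1..20 for which the two sides of P(n) disagree.
mismatches = [
    n for n in range(1, 21)
    if sum_of_powers_of_two(n) != 2**(n + 1) - 2
]
print(mismatches)
```

An empty list means that no counterexample was found among the values tested; only a correct proof establishes $P(n)$ for every $n$.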




Exercise A120 


Consider the statement 
$2^n + 1 \le 3^n$, for $n = 1, 2, \ldots$.


Explain why the following proof is not a correct induction argument, and 
give a correct proof. 


Proof (incorrect!) Let $P(n)$ be the statement $2^n + 1 \le 3^n$.
$P(1)$ is true since $2^1 + 1 = 3$.
Assume $P(k)$; that is, assume that $2^k + 1 \le 3^k$.
We wish to deduce that $P(k + 1)$ is true; that is

$2^{k+1} + 1 \le 3^{k+1}.$
We have

$2^{k+1} + 1 < 2(2^k + 1)$

$= 2 \times 3^k$ (by $P(k)$).

Since $2 \times 3^k < 3 \times 3^k = 3^{k+1}$, we have that $P(k + 1)$ holds.


Hence, by mathematical induction, $P(n)$ is true, for $n = 1, 2, \ldots$. ■


3 Indirect proof 


The proof methods that you will meet in this section are indirect in that 
they do not show directly that the statement to be proved is true. Instead, 
a proof by contradiction assumes that the statement to be proved is false 
and deduces a statement that cannot be true at the same time as some of 
the assumptions (or some other true mathematical statement), and in a 
proof by contraposition the contrapositive is proved, rather than the 
original implication. 




3.1 Proof by contradiction 


Sometimes a useful approach to proving a statement is to ask yourself, 
‘Well, what if the statement were false?’. Consider the following example. 


Worked Exercise A66 


Prove that there is no positive real number $a$ such that


$a + \frac{1}{a} < 2.$


The proof above is an example of proof by contradiction. The idea is 
that if we wish to prove that a statement Q is true, then we begin by 
assuming that Q is false. We then attempt to deduce, using the method of 
a sequence of statements that you saw in Subsection 2.1, a statement that 
is definitely false, which in this context is called a contradiction. If this 
can be achieved, then since everything about our argument is valid except 
possibly the assumption that Q is false, and yet we have deduced a 
contradiction, we can conclude that the assumption is in fact false — in 
other words, Q is true. 


You have already met this kind of proof: the proof of Theorem A1 given in 
Unit A2 is a proof by contradiction. It is repeated below, with further 
explanation of the thinking behind it. 


Theorem A1


There is no rational number $x$ such that $x^2 = 2$.


Proof ®. We assume that $x$ is a rational number such that $x^2 = 2$ and
aim for a contradiction. ®


Suppose for a contradiction that there is a rational number $x$ such that
$x^2 = 2$. Since $x$ is rational, we can write $x = p/q$, where $p \in \mathbb{Z}$ and $q \in \mathbb{N}$.
By replacing $p/q$ by an equivalent fraction in lowest terms, if necessary, we
may assume that the highest common factor of $p$ and $q$ is 1 (that is, that $p$
and $q$ are coprime). The equation $x^2 = 2$ now becomes

$(p/q)^2 = 2,$

so

$p^2 = 2q^2. \quad (1)$
Therefore $p^2$ is even, which implies that $p$ is even (we know that if $p$ were
odd, then $p^2$ would also be odd). So we can write $p = 2r$, where $r$ is an
integer, and equation (1) becomes


$(2r)^2 = 2q^2.$
Therefore we have
$q^2 = 2r^2,$
that is, $q^2$ is even. By a similar argument to that for $p$, we deduce that $q$ is
even.
®. If both p and q are even, then they are not coprime. & 


Since p and q are both even, 2 is a common factor of p and q. But we 
assumed p/q to be a fraction in its lowest terms, so this is a contradiction. 


®. Since we have obtained a contradiction, our assumption that $x$ is a
rational number such that $x^2 = 2$ must be false. ®


Therefore no rational number $x$ exists such that $x^2 = 2$. ■
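The theorem can be illustrated (though of course not proved) by a finite search. The sketch below looks for a fraction $p/q$ with $p^2 = 2q^2$ among denominators up to a chosen bound; the function name `fractions_squaring_to_two` and the bound are assumptions made for this illustration.

```python
from math import isqrt


def fractions_squaring_to_two(max_q):
    """All pairs (p, q) with 1 <= q <= max_q, p >= 0 and p^2 = 2 q^2."""
    hits = []
    for q in range(1, max_q + 1):
        p = isqrt(2 * q * q)        # the only possible numerator for this q
        if p * p == 2 * q * q:
            hits.append((p, q))
    return hits


print(fractions_squaring_to_two(10_000))
```

The search comes back empty, as the theorem predicts; the proof by contradiction is what rules out every denominator at once.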


Exercise A121 


Show that there is no rational number x such that 2? = 


The English mathematician G. H. Hardy (1877-1947) described proof 
by contradiction as ‘one of a mathematician’s finest weapons’. One of 
his favourite examples was a proof by contradiction of the existence of 
infinitely many primes. A version of the proof is given next. A proof 
of this result was originally given by Euclid in about 300 BCE, and it 
was essentially a proof by contradiction. 




Theorem A14 


There are infinitely many prime numbers. 


Proof Suppose that there are only finitely many primes, $p_1, p_2, \ldots, p_n$.


Consider the integer


$N = p_1 p_2 p_3 \cdots p_n + 1.$
This integer is greater than each of the primes $p_1, p_2, \ldots, p_n$, so by our
assumption it is not prime.
®. We can use the Fundamental Theorem of Arithmetic (Theorem A13) to
deduce that $N$ has a prime factor. ®


Now $N$ has a prime factor, $p$ say, by the Fundamental Theorem of
Arithmetic. But $p$ cannot be any of the primes $p_1, p_2, \ldots, p_n$, since dividing
any one of these into $N$ leaves the remainder 1. Thus, $p$ is a prime other
than $p_1, p_2, \ldots, p_n$. This is a contradiction, so our assumption that there
are only finitely many primes must be false. It follows that there are
infinitely many primes. ■
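The construction in the proof can be carried out for any finite list of primes. The following sketch forms $N$ from a given list and finds its smallest prime factor by trial division; the names `smallest_prime_factor` and `new_prime_from` are hypothetical helpers introduced here, not part of the module.

```python
def smallest_prime_factor(n):
    """Smallest prime factor of an integer n >= 2, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n   # no divisor up to sqrt(n), so n itself is prime


def new_prime_from(primes):
    """Return (N, p) where N = (product of primes) + 1 and p is a prime factor of N."""
    N = 1
    for p in primes:
        N *= p
    N += 1
    return N, smallest_prime_factor(N)


print(new_prime_from([2, 3, 5]))              # here N itself is prime
print(new_prime_from([2, 3, 5, 7, 11, 13]))   # here N is composite
```

In the second example $N = 30031 = 59 \times 509$, so $N$ need not be prime itself; what matters is that its prime factors lie outside the original list.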


In the next exercise you can practise proof by contradiction for statements 
that are similar to that in Worked Exercise A66, in that they assert the 
non-existence of numbers with a certain property. 


Exercise A122 


Use proof by contradiction to prove each of the following statements. 
(a) There are no real numbers $a$ and $b$ with $ab > \frac{1}{2}(a^2 + b^2)$.


(b) There are no integers m and n with 5m + 15n = 357. 


The next worked exercise uses proof by contradiction to prove a general 
statement about sets. 


Worked Exercise A67 


Prove that, for any two sets A and B, 
$A \cap (B - A) = \varnothing.$


Proof by contradiction can sometimes be used to prove an implication. To
prove an implication $P \implies Q$ by contradiction, you should begin by
assuming that the implication is false, hoping for a contradiction. That is, 
you should assume that P is true and Q is false. If under these 
assumptions you can deduce a contradiction, then you can conclude that if 
P is true, then Q must also be true, which is the required implication. 
Here is an example. 


Worked Exercise A68


Prove that if $n = ab$ where $n > 0$, then at least one of $a$ and $b$ is less than
or equal to $\sqrt{n}$.


Exercise A123 


Use proof by contradiction to prove that if $n = a + 2b$, where $a$ and $b$ are
positive real numbers, then $a \ge \frac{1}{2}n$ or $b \ge \frac{1}{4}n$.


As a final example that applies what you have learned in this subsection to 
a result that you have already met, we give below a formal proof of 
Theorem A11 from Unit A2, restated as an equivalence. One of the 
implications is proved by contradiction. 


The usual advice applies that if the details of the proof are not all clear, 
for now you should concentrate on the structure of the proof, in particular 
on how the ‘only if’ direction is proved by contradiction. 




Theorem A11 


Let $n$ and $a$ be positive integers, with $a$ in $\mathbb{Z}_n$. Then $a$ has a
multiplicative inverse in $\mathbb{Z}_n$ if and only if $a$ and $n$ are coprime.


Proof ®. We assume that $n$ is a positive integer and $a \in \mathbb{Z}_n$. We start
by proving the ‘only if’ direction. We assume that $a$ has a multiplicative
inverse in $\mathbb{Z}_n$ and we want to deduce that $a$ and $n$ are coprime. We argue
by contradiction. ®


Suppose that $a$ has a multiplicative inverse in $\mathbb{Z}_n$, and assume for a
contradiction that $a$ and $n$ are not coprime, that is, that $a$ and $n$ have a
common factor $d > 1$.


Let $b$ be the multiplicative inverse of $a$ in $\mathbb{Z}_n$; then $b \times_n a = 1$, and so
$ba = kn + 1$
for some integer $k$, and therefore $ba - kn = 1$.


®. Any common factor of $a$ and $n$ is also a common factor of $ba$ and $kn$,
and therefore of $ba - kn$. ®


Since $d$ is a common factor of $a$ and $n$, we have that $d$ divides $ba - kn$. But
this is a contradiction, since $ba - kn = 1$ and $d > 1$.


®. The assumption that $a$ and $n$ are not coprime leads to a contradiction,
so we conclude that $a$ and $n$ are coprime. ®


Therefore $a$ and $n$ are coprime.


®. We now prove the ‘if’ direction in the equivalence, so we start by 
assuming that a and n are coprime. & 


Now let a and n be coprime. 


®. For this direction of the proof we use Euclid’s Algorithm, a method for
finding the highest common factor of two positive integers that you met in
Unit A2. By the Division Theorem, quoted in Subsection 1.3, we know
that there are unique integers $q_1$ and $r_1$ such that


$n = q_1 a + r_1$, with $0 \le r_1 < a$.
Euclid’s Algorithm proceeds by applying the Division Theorem again to
the remainder $r_1$, and then successively repeating this step. ®


We apply Euclid’s Algorithm. From one step of the algorithm to the next,
the remainder decreases by at least 1, so it must eventually reach 0. We
have


$n = q_1 a + r_1, \quad 0 < r_1 < a$
$a = q_2 r_1 + r_2, \quad 0 < r_2 < r_1$
$r_1 = q_3 r_2 + r_3, \quad 0 < r_3 < r_2$
$\vdots$
$r_{m-2} = q_m r_{m-1} + r_m, \quad 0 < r_m < r_{m-1}$
$r_{m-1} = q_{m+1} r_m + 0.$


The final equation shows that $r_m$ is a factor of $r_{m-1}$.


®. Since $r_m$ is a factor of $r_{m-1}$, it is also a factor of $q_m r_{m-1}$, and therefore
of $q_m r_{m-1} + r_m$. Hence $r_m$ is also a factor of $r_{m-2}$, and so on up the
list. ®

Therefore the penultimate equation shows that $r_m$ is a factor of $r_{m-2}$, and
so on. In this way, we find that $r_m$ is a factor of all the remainders
$r_m, r_{m-1}, \ldots, r_1$, and so of both $a$ and $n$.


®. Since a and n are coprime by assumption, their only common factor 
is 1. @ 


Since we assumed that $a$ and $n$ are coprime, we deduce that $r_m = 1$.


Therefore the penultimate equation gives
$1 = r_{m-2} - q_m r_{m-1}.$


By backwards substitution we find that there are integers $k$ and $d$ such
that $1 = kn + da$. Hence $da = -kn + 1$, that is, $d \times_n a = 1$.


If $d \in \mathbb{Z}_n$, then $d$ is a multiplicative inverse of $a$ in $\mathbb{Z}_n$.


If $d \notin \mathbb{Z}_n$, we have $d \equiv b \pmod{n}$ for some $b \in \mathbb{Z}_n$, where $b \neq 0$ and
$b \times_n a = 1$. Hence $b$ is a multiplicative inverse of $a$ in $\mathbb{Z}_n$.


Therefore in either case $a$ has a multiplicative inverse in $\mathbb{Z}_n$, as required. ■
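The ‘if’ direction of the proof is constructive, so it translates naturally into a computation. The sketch below is one possible rendering, assuming the standard iterative form of the extended Euclidean algorithm rather than the backwards substitution described in the proof; the function names `extended_gcd` and `inverse_mod` are illustrative.

```python
def extended_gcd(n, a):
    """Return (g, k, d) with g the highest common factor of n and a, and g == k*n + d*a."""
    old_r, r = n, a
    old_k, k = 1, 0      # coefficients of n
    old_d, d = 0, 1      # coefficients of a
    while r != 0:
        q = old_r // r   # quotient from the Division Theorem
        old_r, r = r, old_r - q * r
        old_k, k = k, old_k - q * k
        old_d, d = d, old_d - q * d
    return old_r, old_k, old_d


def inverse_mod(a, n):
    """Multiplicative inverse of a in Z_n, or None if a and n are not coprime."""
    g, k, d = extended_gcd(n, a)
    if g != 1:
        return None
    return d % n         # reduce d into Z_n, as in the case d not in Z_n


print(inverse_mod(7, 16))   # 7 and 16 are coprime
print(inverse_mod(6, 16))   # 6 and 16 share the factor 2
```

Reducing `d` modulo `n` in the last step corresponds to the final case of the proof, where $d$ is replaced by the element $b$ of $\mathbb{Z}_n$ congruent to it.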


3.2 Proof by contraposition 


Recall that the contrapositive of the implication ‘if P, then Q’ is ‘if not Q, 
then not P’, where ‘not P’ and ‘not Q’ denote the negations of the 
statements P and Q, respectively. 


Since an implication and its contrapositive are equivalent, if you have 
proved one, then you have proved the other. Sometimes the easiest way to 
prove an implication is to prove its contrapositive instead. This is called 
proof by contraposition. Here is an example. The proof makes use of 
the following identity, which is a special case of the Geometric Series 
Identity (Theorem A12) that you met at the beginning of Section 2: for 
any real number x and any positive integer n, we have 


$x^n - 1 = (x - 1)(x^{n-1} + x^{n-2} + \cdots + x + 1). \quad (2)$




Worked Exercise A69 


Prove the following implication about positive integers n: 


if $2^n - 1$ is prime, then $n$ is prime.


Solution 


We prove the contrapositive of the implication, which is
if $n$ is not prime, then $2^n - 1$ is not prime.
Suppose that $n$ is a positive integer that is not prime.


®. We consider two cases separately: the cases $n = 1$ and $n > 1$.
Splitting into separate cases is sometimes an effective way to proceed
in a proof. ®
If $n = 1$, then $2^n - 1 = 2 - 1 = 1$, which is not prime.
If $n > 1$, then since $n$ is not prime by our assumption, we can write
$n = ab$, where $1 < a, b < n$. Hence
$2^n - 1 = 2^{ab} - 1$

$= (2^a)^b - 1$

$= (2^a - 1)((2^a)^{b-1} + \cdots + 2^a + 1),$
where the last line follows from equation (2) by taking $x = 2^a$ and
$n = b$.
Now $2^a - 1 > 1$ since $a > 1$, and also $(2^a)^{b-1} + \cdots + 2^a + 1 > 1$ since
both $a$ and $b$ are greater than 1. Hence $2^n - 1$ is not prime. We have


thus proved the required contrapositive implication in both the cases
$n = 1$ and $n > 1$. Hence the original implication is also true.
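The factorisation used in the case $n > 1$ can be checked directly for small values. This is a sketch under the assumption that $a$ and $b$ are integers greater than 1; the function name `mersenne_factors` is hypothetical.

```python
def mersenne_factors(a, b):
    """The two factors of 2^(ab) - 1 given by identity (2) with x = 2^a and n = b."""
    x = 2**a
    first = x - 1
    second = sum(x**j for j in range(b))   # x^(b-1) + ... + x + 1
    return first, second


for a, b in [(2, 3), (3, 4), (5, 2)]:
    first, second = mersenne_factors(a, b)
    print(a * b, first, second, first * second == 2**(a * b) - 1)
```

For example, with $a = 2$ and $b = 3$ this gives $2^6 - 1 = 63 = 3 \times 21$, and both factors exceed 1.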


When proving results about integers, proof by contraposition is especially 
useful when the conclusion has the form ‘n is even’, or ‘n is odd’, or a 
combination of statements of this kind. Below is a different example that 
in fact gives a proof by contraposition of the missing implication in 
Worked Exercise A58 in Subsection 2.2. 


Recall from Subsection 3.2 of Unit A2 that for integers $a$, $b$ and $n$,
$a \equiv b \pmod{n}$ if (and only if) $a$ and $b$ have the same remainder on division
by $n$.


Worked Exercise A70 


Prove the ‘if’ direction of the statement in Worked Exercise A58; that is, 
prove that, if n is a positive integer, 


if $n^2$ is a multiple of 5, then $n$ is a multiple of 5.


Solution 


®. We have seen in Worked Exercise A58 that the assumption ‘$n^2$ is a
multiple of 5’ does not help us reach the conclusion in any obvious
way, so we try proving the contrapositive. ®


We prove the contrapositive implication, which is
if $n$ is not a multiple of 5, then $n^2$ is not a multiple of 5.


Suppose that $n$ is not a multiple of 5.


Then, by the Division Theorem, $n$ can be written as one of
$5k + 1$ or $5k + 2$ or $5k + 3$ or $5k + 4$,


for some integer $k$, and so one of the following holds:


$n \equiv 1 \pmod{5}$ or $n \equiv 2 \pmod{5}$ or $n \equiv 3 \pmod{5}$ or $n \equiv 4 \pmod{5}$.


We consider each of these cases.


®. We use the powers property of congruences which you proved in
Exercise A119. For congruences modulo 5 it follows from this result
that if $a \equiv b \pmod{5}$, then $a^2 \equiv b^2 \pmod{5}$. ®

If $n \equiv 1 \pmod{5}$, then $n^2 \equiv 1 \pmod{5}$. If $n \equiv 2 \pmod{5}$, then
$n^2 \equiv 2^2 = 4 \pmod{5}$. Similarly, if $n \equiv 3 \pmod{5}$, then
$n^2 \equiv 9 \equiv 4 \pmod{5}$, and if $n \equiv 4 \pmod{5}$, then $n^2 \equiv 16 \equiv 1 \pmod{5}$.
®. These cases cover all the possibilities for $n^2$, and in each case the
remainder on dividing $n^2$ by 5 is not zero. ®


Therefore $n^2$ is not a multiple of 5, as required.


Since the contrapositive is true, the original implication is also true. 
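The four cases in the solution can be tabulated mechanically. The short sketch below computes the remainder of $r^2$ on division by 5 for each non-zero remainder $r$, and then spot-checks a range of integers; the variable names are illustrative only.

```python
# Remainder of r^2 on division by 5, for each non-zero remainder r.
square_residues = {r: (r * r) % 5 for r in range(1, 5)}
print(square_residues)

# Spot check: no integer in the range that is not a multiple of 5
# has a square that is a multiple of 5.
no_counterexample = all(
    (n * n) % 5 != 0 for n in range(-50, 51) if n % 5 != 0
)
print(no_counterexample)
```

The table reproduces the remainders 1, 4, 4, 1 found in the solution, none of which is zero.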


Exercise A124 


Use proof by contraposition to prove each of the following statements 
about integers m and n. 


(a) If $n^2 + 2n + 1$ is even, then $n$ is odd.
(b) If $mn$ is odd, then both $m$ and $n$ are odd.


(c) If an integer $n > 1$ is not divisible by any of the primes less than or
equal to $\sqrt{n}$, then $n$ is a prime number.


Hint: Use the result of Worked Exercise A68. 


The next worked exercise involves a statement about sets that can be 
proved rather neatly by contraposition. 




Worked Exercise A71 


Prove that, for any sets A and B, 
if $(A - B) \cup (B - A) = A \cup B$, then $A \cap B = \varnothing$.


Exercise A125 


Let A and B be sets. Prove that 
if $A \subseteq B$, then $A - B = \varnothing$.


The final exercise in this subsection asks you to read critically an 
attempted proof by contraposition. 


Exercise A126 
Consider the statement 


if $n^3 + 3$ is even, then $n$ is odd.


Explain why the argument below is not a correct proof of this statement, 
and give a correct proof. 




Proof (incorrect!) We prove the contrapositive, that is


if $n$ is odd, then $n^3 + 3$ is even.


Assume $n$ is odd. Then $n = 2k + 1$ for some integer $k$, and so
$n^3 + 3 = (2k + 1)^3 + 3$
$= 8k^3 + 12k^2 + 6k + 1 + 3$
$= 8k^3 + 12k^2 + 6k + 4$
$= 2(4k^3 + 6k^2 + 3k + 2).$


This shows that $n^3 + 3$ is even, as required.


The exercise above is an example of a common pitfall when trying to prove 
a statement by contraposition, that is, a mistake in finding the 
contrapositive. 


You may have found some of the ideas so far in this unit difficult to get 
used to; this is to be expected since reading and understanding 
mathematics, and writing mathematics clearly and accurately, can both be 
difficult at first. Your skills will improve as you gain experience. To 
accelerate this improvement, you should, when reading mathematics, try 
to make sure that you gain a clear understanding of exactly what each 
statement asserts. When writing mathematics, you should try to be as 
clear and accurate as you can. Include enough detail to make the 
argument clear, but omit any statements that are not necessary to reach 
the required conclusion. A good check is to read over your work and ask 
yourself whether you would be able to follow what you have written in six 
months’ time, when you have forgotten the thoughts and rough work that 
led to it. Use the solutions to the exercises and worked exercises in the 
module as models for good mathematical writing. 


You may find it helpful to revisit parts of Sections 1, 2 and 3 later in your 
study of the module. 


4 Equivalence relations 


In this section you can apply many of the ideas about careful, logical 
thinking and proof that you have learned in the previous sections of this 
unit to a new topic in which this approach is needed. This topic is the 
important one of equivalence relations. Equivalence relations occur 
throughout mathematics, and are particularly important in the group 
theory units of this module. 




4.1 What is an equivalence relation? 


As you would expect, an equivalence relation is a special type of relation, 
so we will start by looking at what is meant by a relation. 


In everyday life we often work with relations between objects. For 
example, is a child of is a relation between people: we might say ‘Emma 
Smith is a child of Stephen Smith’, for instance. Other examples of 
relations between people include is a descendant of and lives in the same 
street as. An example of a relation between other types of object is shares 
a border with, between countries of the world. For instance, we might say 
‘France shares a border with Germany’. 


In mathematics, too, we often work with relations between objects. For 
example, is a multiple of is a relation between the numbers in the set N. 
Thus we can make statements such as ‘6 is a multiple of 3’, which is true, 
and ‘5 is a multiple of 2’, which is false. Another example of a 
mathematical relation is is parallel to, applied to the lines in the plane. 


It is sometimes useful to denote a mathematical relation by a symbol, and 
in this module we usually use the symbol ~. For example, if we use ~ to 
denote the relation is a multiple of, then we write the statement ‘6 is a 
multiple of 3’ as 6 ~ 3. The symbol ~ is known as tilde (pronounced 
‘tilder’), and in mathematics is usually read as ‘twiddles’. So you can read 
the statement 6 ~ 3 as ‘6 twiddles 3’. Alternatively, you can read it as ‘6 is 
related to 3’. (In some texts you may see the symbol R rather than ~ used 
for ‘is related to’, so 6 ~ 3 would be written as 6 R3.) 


Some frequently used relations have their own special symbols. For 
example, the relation is less than is usually denoted by the special 
symbol <. Examples of the use of this symbol are the statement —2 < 1, 
which is true, and the statements 1 < —2 and 3 < 3, which are both false. 


Here is a precise definition of what we mean by a relation. 


Definition 
We say that ~ is a relation on a set X if, whenever x,y € X, the 
statement x ~ y is either true or false. 


If ~ is a relation on a set X and x ~ y is false for a particular pair of 
elements $x$ and $y$ in $X$, then we write $x \nsim y$.


Here are some more examples of relations. 


1. Is equal to is a relation on the set R. This is because, for any x,y in R, 
the statement ‘x is equal to y’ is either definitely true or definitely false. 
This relation is usually denoted by the special symbol =. For instance, 
the statement 3 = 3 is true, and the statement 3 = 7 is false. The 
relation ‘=’ has the unusual property that for any real number x, the 
only real number y such that x = y is x itself. 


2. Is the derivative of is a relation on any set of functions. We can define 


$g \sim f$ if $g$ is the derivative of $f$.


For example, let $f$, $g$ and $h$ be the real functions given by $f(x) = x^3$,
$g(x) = 3x^2$ and $h(x) = 2e^x$. Then $g \sim f$ because $g$ is the derivative of $f$,
and $h \sim h$ because $h$ is the derivative of $h$, but $f \nsim g$ because $f$ is not
the derivative of $g$.


3. On $\mathbb{C}$, we can define a relation
$z_1 \sim z_2$ if $|z_1 - z_2| \le 4$;
that is, $z_1 \sim z_2$ if the distance between $z_1$ and $z_2$ in the complex plane is
less than or equal to 4. For example, $(1 + i) \sim (2 - i)$ because
$|(1 + i) - (2 - i)| = |-1 + 2i| = \sqrt{5} \le 4,$
but $(1 + i) \nsim (3 + 5i)$ because
$|(1 + i) - (3 + 5i)| = |-2 - 4i| = \sqrt{20} = 2\sqrt{5} > 4$
(see Figure 3).

Figure 3 Some distances in the complex plane (the points $1 + i$, $2 - i$ and $3 + 5i$, with distances $\sqrt{5} \approx 2.24$ and $2\sqrt{5} \approx 4.47$)


Exercise A127 


Let ~ be the relation on N defined by 
$x \sim y$ if $x$ is a divisor of $y$.


Which of the following statements are true?
(a) $3 \sim 6$ (b) $6 \sim 3$ (c) $6 \sim 18$ (d) $6 \sim 6$


Exercise A128 


Let ~ be the relation on R defined by 


$x \sim y$ if $x - y$ is an integer.
(a) Which of the following statements are true? 


G) 13453 (ii) 28.21 (iii) 24~—-5.4 


(b) (i) Write down a real number y such that 0.8 ~ y. 
(ii) Write down a real number $z$ such that $0.8 \nsim z$.




In this section we are mainly interested in relations of a type known as 
equivalence relations. These are relations that have three special 
properties, as defined below. 


Definition 
A relation $\sim$ on a set $X$ is an equivalence relation if it has the
following three properties.

E1 Reflexivity For all $x$ in $X$,
$x \sim x$.

E2 Symmetry For all $x, y$ in $X$,
if $x \sim y$, then $y \sim x$.

E3 Transitivity For all $x, y, z$ in $X$,
if $x \sim y$ and $y \sim z$, then $x \sim z$.


If a relation has the first, second or third property above, then we say that 
it is reflexive, symmetric or transitive, respectively. 


The three properties reflexivity, symmetry and transitivity are independent 
in the sense that relations exist with every combination of the three 
properties. 


Note that if a relation ~ is symmetric, then x ~ y and y ~ x mean the 
same, and we can write either of these interchangeably. For example, the 
relation = on R is symmetric, so x = y and y = x mean the same. The 
relation < on R is not symmetric, so x < y and y < x mean different 
things. 
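On a finite set, the three properties can be tested exhaustively. The sketch below is an illustration only: the function names are hypothetical, and a finite sample of real numbers can refute a property by exhibiting a counterexample but cannot prove that it holds on all of $\mathbb{R}$.

```python
from itertools import product


def is_reflexive(X, rel):
    """E1: x ~ x for all x in X."""
    return all(rel(x, x) for x in X)


def is_symmetric(X, rel):
    """E2: if x ~ y then y ~ x, for all x, y in X."""
    return all(rel(y, x) for x, y in product(X, repeat=2) if rel(x, y))


def is_transitive(X, rel):
    """E3: if x ~ y and y ~ z then x ~ z, for all x, y, z in X."""
    return all(
        rel(x, z)
        for x, y, z in product(X, repeat=3)
        if rel(x, y) and rel(y, z)
    )


def equals(x, y):
    return x == y


def less_than(x, y):
    return x < y


sample = [-2, -1, 0, 0.5, 1, 3]   # a small finite sample of real numbers
for name, rel in [("=", equals), ("<", less_than)]:
    print(name, is_reflexive(sample, rel), is_symmetric(sample, rel),
          is_transitive(sample, rel))
```

On this sample, `=` passes all three checks, while `<` fails the reflexive and symmetric checks and passes the transitive one, in line with the remarks above about $=$ and $<$.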

To help you understand the three properties, it can be helpful to think 
through whether they hold for some non-mathematical relations, as in the 
next worked exercise. 


Worked Exercise A72 


For each of the following non-mathematical relations on a set of people, 
state whether the relation is reflexive, symmetric and transitive, briefly 
justifying your answers, and hence state whether the relation is an 
equivalence relation. 

(a) ‘lives on the same street as’ 


(b) ‘is a descendant of’ 


Solution 


(a) E1 The relation ‘lives on the same street as’ is reflexive, because 
each person lives on the same street as themselves. 

E2 It is also symmetric, because if person A lives on the same
street as person B, then it follows that person B lives on the 
same street as person A. 

E3 Finally, it is transitive, because if person A lives on the same 
street as person B, and person B lives on the same street as 
person C, then person A lives on the same street as 
person C. 


Hence this relation is an equivalence relation. 
(b) E1 The relation ‘is a descendant of’ is not reflexive, because a 
person is not a descendant of themselves. 


E2 Nor is it symmetric, because if person A is a descendant of 
person B, then it does not follow that person B is a 
descendant of person A. 

E3 However, it is transitive, because if person A is a descendant 
of person B, and person B is a descendant of person C, then 
it follows that person A is a descendant of person C. 


Since this relation is not reflexive (or symmetric), it is not an 
equivalence relation. 


Exercise A129 


For each of the following non-mathematical relations on a set of people, 
state whether the relation is reflexive, symmetric and transitive, briefly 
justifying your answers, and hence state whether the relation is an 
equivalence relation. 


(a) ‘has sat next to’ 


(b) ‘was born in the same year as’ 




Here are two mathematical examples. 


Worked Exercise A73 


For each of the following relations on the set R, state whether the relation 
is reflexive, symmetric and transitive, briefly justifying your answers, and 
hence state whether the relation is an equivalence relation. 


(a) = (b) < 


Notice that the statements that you need to prove to show that a relation 
is reflexive, symmetric or transitive, which are given in the definition in the 
box shortly before Worked Exercise A72, start with the words ‘For all’ and 
are therefore universal statements. So to show that a relation ~ on a set X 
is reflexive, for example, you must show that x ~ x for all elements x 

in X: it is not enough to show that there exists an element x in X such 
that x ~ x. To show that a relation ~ on a set X is not reflexive, you just 
have to show that there is a counterexample — that is, an element x in X 
such that $x \nsim x$.


Similarly, to show that a relation $\sim$ on a set $X$ is symmetric, you must
prove that $x \sim y \implies y \sim x$ for every $x, y \in X$, while to show that it is
not symmetric, you just have to show that there is a counterexample, that
is, a pair $x, y \in X$ for which this property does not hold. Analogous
statements hold for the transitive property, involving triples $x, y, z \in X$.


Here is a worked exercise involving two mathematical relations that are 
more complicated than = and <. You met these two relations earlier in 
this subsection. 




Worked Exercise A74 


For each relation below, determine whether it has the reflexive, symmetric 
and transitive properties, and hence state whether it is an equivalence 
relation. 


(a) The relation $\sim$ defined on $\mathbb{C}$ by
$z_1 \sim z_2$ if $|z_1 - z_2| \le 4$.
(b) The relation $\sim$ defined on $\mathbb{R}$ by


$x \sim y$ if $x - y$ is an integer.


Solution
(a) E1 Let $z \in \mathbb{C}$. Then
$|z - z| = 0 \le 4,$
so $z \sim z$. Thus $\sim$ is reflexive.
E2 Let $z_1, z_2 \in \mathbb{C}$, and suppose that $z_1 \sim z_2$. Then $|z_1 - z_2| \le 4$,
so
$|z_2 - z_1| = |z_1 - z_2| \le 4.$
Hence $z_2 \sim z_1$. Thus $\sim$ is symmetric.


E3 The relation $\sim$ is not transitive, as demonstrated by the
example $z_1 = 0$, $z_2 = 3$, $z_3 = 6$ (illustrated in Figure 4):


$|z_1 - z_2| = |0 - 3| = |-3| = 3 \le 4$, so $z_1 \sim z_2$,

$|z_2 - z_3| = |3 - 6| = |-3| = 3 \le 4$, so $z_2 \sim z_3$,
but

$|z_1 - z_3| = |0 - 6| = |-6| = 6 > 4$, so $z_1 \nsim z_3$.


Figure 4 Some distances in the complex plane


Since $\sim$ is not transitive, it is not an equivalence relation.
(b) E1 Let $x \in \mathbb{R}$. Then $x - x = 0$, which is an integer. Thus $\sim$ is
reflexive.
E2 Let $x, y \in \mathbb{R}$, and suppose that $x \sim y$. Then $x - y$ is an
integer, say
$x - y = k,$
where $k \in \mathbb{Z}$. It follows that


$y - x = -(x - y) = -k,$


which is an integer. Hence $y \sim x$. Thus $\sim$ is symmetric.
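The counterexample to transitivity in part (a) is easy to reproduce numerically, since Python's built-in `abs` returns the modulus of a complex number. The sketch below is illustrative; the function name `related` is an assumption, not notation from the module.

```python
def related(z1, z2):
    """z1 ~ z2 if the distance |z1 - z2| in the complex plane is at most 4."""
    return abs(z1 - z2) <= 4


z1, z2, z3 = 0, 3, 6
print(related(z1, z2), related(z2, z3), related(z1, z3))

# The earlier examples of this relation, with i written as 1j:
print(related(1 + 1j, 2 - 1j))   # distance sqrt(5)
print(related(1 + 1j, 3 + 5j))   # distance 2*sqrt(5)
```

A single triple for which the first two checks succeed and the third fails is enough to show that the relation is not transitive.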




Here is a similar exercise for you to try. In part (e), note that in this unit 
we will take the definition of parallel to be ‘in the same direction as’. 
Thus any line is parallel to itself. In fact the word parallel may be defined 
to have either of two possible meanings: the meaning just mentioned, and 
the one given in Unit A1 in which two lines are parallel if they never meet.
Both definitions are accepted in mathematics; the only difference between 
them is that with the definition in Unit A1 a line is not parallel to itself. 


Part (f) of the exercise involves the integer part of a real number. For any
real number $x$, the integer part of $x$ (also called the floor of $x$), denoted
by $\lfloor x \rfloor$, is the largest integer that is less than or equal to $x$. (You will meet
this again in your study of functions in Unit A4 Real functions, graphs and
conics.) For example, $\lfloor 4.3 \rfloor = 4$, $\lfloor -4.3 \rfloor = -5$ and $\lfloor 4 \rfloor = 4$. So, for any real
number $x$, the integer part $\lfloor x \rfloor$ of $x$ is obtained by rounding down to the
nearest integer; the rounding is always down, no matter whether $x$ is
positive or negative.
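In Python the integer part is available as `math.floor`. The brief sketch below reproduces the three examples and contrasts rounding down with truncation towards zero, which is what the built-in `int` does; the wrapper name `integer_part` is a hypothetical convenience.

```python
import math


def integer_part(x):
    """The largest integer less than or equal to x."""
    return math.floor(x)


for x in (4.3, -4.3, 4):
    print(x, integer_part(x))

# Truncation is not the same as rounding down for negative numbers.
print(int(-4.3), integer_part(-4.3))
```

The last line shows the difference: truncating $-4.3$ gives $-4$, whereas the integer part is $-5$.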


Exercise A130 


For each relation below, determine whether it has the reflexive, symmetric 
and transitive properties, and hence state whether it is an equivalence 
relation. 


(a) The relation ~ defined on Z by 
m~n if m—n is even. 

(b) The relation ~ defined on Z by 
m~n if m—n is odd. 

(c) The relation ~ defined on Z by 


m~n if m? +n? is even. 


(d) The relation $\sim$ defined on $\mathbb{C}$ by
$z_1 \sim z_2$ if $|z_1| = |z_2|$.

(e) The relation $\sim$ defined on the set of all lines in the plane by
$l_1 \sim l_2$ if the lines $l_1$ and $l_2$ are parallel.

(f) The relation $\sim$ defined on $\mathbb{R}$ by


$x \sim y$ if $\lfloor x - y \rfloor = 0$.


You have already met an important family of equivalence relations in
Unit A2. Let $n$ be any integer greater than 1, and consider the relation is
congruent modulo $n$ to on the set $\mathbb{Z}$. As you have seen, this relation is
usually denoted by the special symbol $\equiv$, and we usually also include
‘(mod $n$)’ to make it clear which value of $n$ we are working with. For
example, with $n = 7$, the statement


$1 \equiv 8 \pmod{7}$

is true, and the statement
$1 \equiv 12 \pmod{7}$

is false.


You saw in Unit A2 that the reflexive, symmetric and transitive properties 
hold for congruence modulo n; these properties are the first three 


properties in Theorem A10 in that unit. So we have the following theorem. 


Theorem A15 


For any integer n > 1, congruence modulo n is an equivalence relation 
on Z. 
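As an illustration of the theorem (a finite spot check, not a proof), the sketch below tests the three properties of congruence modulo 7 on a small range of integers, and also checks that being congruent agrees with having the same remainder on division by 7. The function name `congruent` and the chosen range are assumptions made for this example.

```python
def congruent(a, b, n):
    """a ≡ b (mod n): a - b is a multiple of n."""
    return (a - b) % n == 0


n = 7
sample = range(-10, 11)

reflexive = all(congruent(a, a, n) for a in sample)
symmetric = all(
    congruent(b, a, n)
    for a in sample for b in sample
    if congruent(a, b, n)
)
transitive = all(
    congruent(a, c, n)
    for a in sample for b in sample for c in sample
    if congruent(a, b, n) and congruent(b, c, n)
)
# Congruent integers are exactly those with the same remainder on division by n.
same_remainder = all(
    congruent(a, b, n) == (a % n == b % n)
    for a in sample for b in sample
)

print(reflexive, symmetric, transitive, same_remainder)
print(congruent(1, 8, n), congruent(1, 12, n))
```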


Very roughly, you can think of any equivalence relation as a relation that 
defines some kind of ‘equivalence’ on the objects in the set on which the 
relation is defined. 




For example, with the equivalence relation was born in the same year as on 
a set of people, two people are ‘equivalent’ if they were born in the same 
year. Imagine that you are selecting people to take part in a survey, and 
the only selection criterion is that you need to select ten people born in 
each year from 1950 to 1999. Then, as far as selecting people for the survey 
is concerned, you would consider two people to be ‘equivalent’ if they were 
born in the same year. For instance, if Ashok and Becky were both born in 
1992, then it doesn’t matter which of them you select: they are equivalent. 


Here are two mathematical examples. First, with the equivalence relation 

is equal to on the set R, two real numbers are ‘equivalent’ if they are equal. 
This is a very strict type of equivalence, where two objects are ‘equivalent’ 
only if they are exactly the same (though they might be written differently, 


such as 5, - and 0.5). 


Second, with the equivalence relation in Exercise A130(e), two lines in the 
plane are ‘equivalent’ if they are parallel. This equivalence between lines 
might be useful if we were interested only in the directions of lines and not 
in their positions in the plane. 


With the equivalence relation congruence modulo n, two integers are 
‘equivalent’ if they have the same remainder on division by n. We use this 
type of equivalence when we carry out modular arithmetic. 


Finally in this subsection, here is an interesting ‘spot the error’ exercise. It 
involves an incorrect proof that appears to show that if a relation is both 
symmetric and transitive, then it is also reflexive. If this were true, then 
we could define an equivalence relation to be a relation that is symmetric 
and transitive — we could omit the condition that it must also be reflexive. 
However, it is not true, as you are asked to show in the exercise. The error 
in the proof is very subtle. It highlights just how careful we need to be in 
mathematical arguments. 


Exercise A131 


Consider the following incorrect claim and incorrect proof. 


Claim (incorrect) 


Let ~ be a relation on a set X. If ~ is symmetric and transitive, then 
~ is also reflexive. 


Proof (incorrect) Suppose that ~ is symmetric and transitive. 
We will show that ~ is then also reflexive. Let x ∈ X. We have to 
show that x ~ x. Let y be an element of X such that x ~ y. Then, 
since ~ is symmetric, we have y ~ x. Since x ~ y and y ~ x, and ~ is 
transitive, we have x ~ x, as required. Thus ~ is reflexive. ∎ 


(a) Show that the claim is incorrect by demonstrating that the relation ~ 
defined on R by 


x ~ y if xy > 0 


is a counterexample. That is, you have to show that this relation ~ is 
symmetric and transitive, but not reflexive. 


(b) Try to spot the error in the given proof. (Do not worry if you cannot 
spot it, as it is subtle, but be impressed with yourself if you can! 
Make sure to look at the answer.) 


4.2 Equivalence classes 


We now look at the idea of an equivalence class. This idea is associated 
only with equivalence relations, not with relations in general. 


Definition 
Let ~ be an equivalence relation on a set X, and let x ∈ X. Then the 
equivalence class of x, denoted by [x], is the set 

[x] = {y ∈ X : x ~ y}. 

In other words, [x] is the set of all the elements in X that are related to x 
by the equivalence relation; that is, it is the set of all the elements in X 
that are equivalent to x, where the equivalence is given by the equivalence 
relation. Notice that [x] includes the element x itself, because for an 
equivalence relation we have x ~ x. 


For example, consider the equivalence relation was born in the same year 
as on a set of people. The equivalence class of a particular person is the 
set of people who were born in the same year as that person, including the 
person themself. 


As a mathematical example, consider the equivalence relation on the set of 
lines in the plane defined by 


l_1 ~ l_2 if the lines l_1 and l_2 are parallel. 


You saw that this relation is an equivalence relation in Exercise A130(e). 

Consider any particular line l in the plane. Then the equivalence class [l] 
of l is the set of all the lines in the plane that are parallel to l, including l 
itself. 


Figure 5 Lines in the plane 

Figure 6 Partitioning a set 


Let us think about this example a little more. Consider a particular line l, 
as illustrated in Figure 5. Notice if you choose any line, say m, that lies in 
the equivalence class [l] of l, then [m] and [l] are in fact the same set. On 
the other hand, if you choose a line, say k, that is not in the equivalence 
class [l] of the original line l, then not only are the two sets [k] and [l] 
different sets, but in fact they are disjoint sets. (Remember that two sets 
are said to be disjoint if they have no elements in common.) 


In fact, you can see that essentially what has happened here is that the set 
of all the lines in the plane has been split into a collection of subsets, with 
each subset consisting of all the lines in a particular direction. 


So the equivalence classes of this particular equivalence relation split the 
set on which the relation is defined into a collection of subsets, such that 
each pair of these subsets is disjoint. Such a collection of subsets is known 
as a partition of the set, as defined below and illustrated in Figure 6. 


Definitions 


A collection of non-empty subsets of a set is a partition of the set if 
every pair of subsets in the collection is disjoint and the union of all 
the subsets in the collection is the whole set. 


We say that such a collection of subsets partitions the set. 


In other words, a collection of non-empty subsets of a set is a partition of 
the set if every element of the set belongs to exactly one of the subsets in 
the collection. 
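For a finite set, the three conditions in the definition (non-empty subsets, pairwise disjoint, union equal to the whole set) can be tested directly. The sketch below is illustrative only; the function name `is_partition` and the sample sets are assumptions made for the example.

```python
from itertools import combinations


def is_partition(subsets, whole):
    """Check whether `subsets` is a partition of the finite set `whole`."""
    subsets = [set(s) for s in subsets]
    whole = set(whole)
    # Every subset in the collection must be non-empty.
    non_empty = all(len(s) > 0 for s in subsets)
    # Every pair of subsets must have no elements in common.
    pairwise_disjoint = all(a.isdisjoint(b) for a, b in combinations(subsets, 2))
    # The union of all the subsets must be the whole set.
    covers = set().union(*subsets) == whole
    return non_empty and pairwise_disjoint and covers


print(is_partition([{1, 2}, {3}, {4, 5}], {1, 2, 3, 4, 5}))  # a partition
print(is_partition([{1, 2}, {2, 3}], {1, 2, 3}))             # 2 lies in two subsets
```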


In fact, for every equivalence relation, its equivalence classes form a 
partition of the set on which the relation is defined, as stated and proved 
below. 


Theorem A16 


The equivalence classes of an equivalence relation on a set X form a 
partition of the set X. 


Proof First, since every element x of X belongs to an equivalence class, 
namely its own equivalence class [x], the union of the equivalence classes 
of ~ is the whole set X. 


To prove that the equivalence classes of ~ partition the set X, the other 
property that we have to show is that if x and y are any elements of X, 
then their equivalence classes [x] and [y] are either the same subset of X, 
or disjoint subsets of X. We can prove this as follows. 


Let x and y be elements of X, and suppose that [x] and [y] are not 
disjoint, that is, they have at least one element in common, say z. We will 
show that [x] and [y] must then be the same set, that is, [x] = [y]. 

To do this, we use the strategy for proving that two sets are equal given 
in Unit A1: we show that each set is a subset of the other. 

First we show that [x] ⊆ [y]. Suppose that a ∈ [x]; we have to show that 
a ∈ [y]. Since both a and z are in [x], we know that x ~ a and x ~ z. 
Hence (since the relation ~ is symmetric and transitive) we have a ~ z. 
But we also know that y ~ z, because z ∈ [y], so (again since ~ is 
symmetric and transitive) it follows that y ~ a. Hence a ∈ [y], as claimed. 

We can show in the same way that [y] ⊆ [x] (we interchange the roles of x 
and y in the proof that [x] ⊆ [y]). 

Hence [x] = [y]. This completes the proof. ∎ 


The proposition below was proved as part of the proof of Theorem A16 
above, and you saw it illustrated for a particular equivalence relation (the 
one involving lines in the plane) near the start of this subsection. It is an 
important fact to keep in mind when you are working with equivalence 
classes. 


Proposition A17 


The equivalence classes of an equivalence relation on a set X have the 
following property: if x and y are elements of X, then their 
equivalence classes [x] and [y] are either equal sets or disjoint sets. 


If you think about an equivalence relation as defining a type of 
‘equivalence’, then Theorem A16 seems true intuitively. Each equivalence 
class is a subset of elements that are all ‘equivalent’ to each other. Each 
element lies in such a class, and each element is not equivalent to any 
element outside its own class. 
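On a finite set, the way an equivalence relation sorts elements into classes can be imitated by a short program. The sketch below assumes that the supplied function `related` really is reflexive, symmetric and transitive (it does not check this), so comparing a new element with a single member of each existing class is enough. The names `equivalence_classes` and `same_year`, and the people Chen and Dana, are hypothetical.

```python
def equivalence_classes(elements, related):
    """Group a finite collection into the classes of an equivalence relation.

    `related(x, y)` is assumed to be an equivalence relation; the result is
    not meaningful otherwise.
    """
    classes = []
    for x in elements:
        for cls in classes:
            representative = next(iter(cls))  # any member of the class will do
            if related(representative, x):
                cls.add(x)
                break
        else:
            # x is not related to any existing class, so it starts a new one.
            classes.append({x})
    return classes


def same_year(p, q):
    """Two (name, birth year) pairs are related if the years agree."""
    return p[1] == q[1]


people = [("Ashok", 1992), ("Becky", 1992), ("Chen", 1966), ("Dana", 2001)]
for cls in equivalence_classes(people, same_year):
    print(sorted(cls))
```

Each element ends up in exactly one of the returned sets, which is what Theorem A16 leads us to expect.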


As an example of Theorem A16, consider the equivalence relation was born 
in the same year as on a set of people. The equivalence class of each 
person is the set of people born in the same year as that person. So the 
whole set of people is partitioned into a set of classes: the class of people 
born in 1966, the class of people born in 1992, the class of people born in 
2001, and so on. Each person belongs to one of these classes, and each pair 
of the classes is disjoint. 


As a mathematical example of Theorem A16, consider the equivalence 
relation is equal to on R. Consider any number x € R. Since x = y only 
if y is the same number as x, the equivalence class of the real number x 
contains only the number x itself. So each element lies in a single-element 
equivalence class, as illustrated in Figure 7. For example, [0] = {0}, 
[1] = {1}, and so on. 


Figure 7 Some equivalence classes of ‘is equal to’ 


As another mathematical example of Theorem A16, consider the 
equivalence relation congruence modulo 5 defined on Z. The equivalence 
class of 0 is the subset of Z containing all the integers that are congruent 
to 0 modulo 5. Similarly, the equivalence class of 1 is the subset of Z 
containing all the integers that are congruent to 1 modulo 5, and so on. 


That is, 
[0] = {..., −10, −5, 0, 5, 10, 15, ...}, 
[1] = {..., −9, −4, 1, 6, 11, 16, ...}, 
[2] = {..., −8, −3, 2, 7, 12, 17, ...}, 
[3] = {..., −7, −2, 3, 8, 13, 18, ...}, 
[4] = {..., −6, −1, 4, 9, 14, 19, ...}. 

There are only five distinct equivalence classes since, for example, [5] is 
the same set as [0], and [6] is the same set as [1], and so on. The 
collection of five equivalence classes partitions the set Z, as illustrated in 
Figure 8: every number in Z belongs to one of the five classes, and the five 
classes are all disjoint from each other. 


In general, congruence modulo n partitions the set Z into n distinct 
equivalence classes. 
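Since each class is infinite, a program can only list the part of each class that lies in a finite window of integers. The following sketch does this for congruence modulo n; the function name `classes_mod` and the choice of window are assumptions made for illustration.

```python
def classes_mod(n, low, high):
    """Split the integers low..high (inclusive) into congruence classes modulo n.

    Returns a dict mapping each remainder r in {0, ..., n-1} to the list of
    integers in the window that are congruent to r modulo n.
    """
    classes = {r: [] for r in range(n)}
    for m in range(low, high + 1):
        # For n > 0, Python's % operator returns a remainder in 0..n-1,
        # even when m is negative.
        classes[m % n].append(m)
    return classes


for r, members in classes_mod(5, -10, 19).items():
    print(f"[{r}] contains ..., {', '.join(map(str, members))}, ...")
```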


Notice that an equivalence class of an equivalence relation may be a finite 
set or an infinite set, and that an equivalence relation may have finitely 
many equivalence classes or infinitely many equivalence classes. 


The next worked exercise involves finding a particular equivalence class of 
another equivalence relation. 

Figure 8 The five equivalence classes of congruence modulo 5 

Worked Exercise A75 


Find the equivalence class [3.7] of the equivalence relation defined on R by 


x ~ y if x − y is an integer. 


(You saw that this relation is an equivalence relation in Worked 
Exercise A74(b).) 




Exercise A132 


Find the equivalence class [1] of the equivalence relation ~ defined on Z by 
m ~ n if m − n is even. 


(You saw that this relation is an equivalence relation in Exercise A130(a).) 


If we want to find all the equivalence classes of an equivalence relation, 
then it often helps to start by finding a particular equivalence class, or a 
few particular equivalence classes, as demonstrated in the next worked 
exercise. This can help us to see what happens in general. 




Worked Exercise A76 


Let ~ be the equivalence relation defined on C by 
z_1 ~ z_2 if |z_1| = |z_2|. 
(You saw that this relation is an equivalence relation in Exercise A130(d).) 


(a) Find the equivalence classes [0] and [i]. 


(b) Describe all the equivalence classes of ~. 


Solution 
(a) We have 
[0] = {z ∈ C : 0 ~ z} 
= {z ∈ C : |0| = |z|} 
= {z ∈ C : |z| = 0} 
= {0}. 
So [0] is the set containing the complex number 0 alone. 
Similarly, 
[i] = {z ∈ C : i ~ z} 
= {z ∈ C : |i| = |z|} 
= {z ∈ C : |z| = 1}. 
So [i] is the set of all complex numbers of modulus 1. 
(b) In general, for any complex number z_0, say, we have 
[z_0] = {z ∈ C : z_0 ~ z} 
= {z ∈ C : |z_0| = |z|} 
= {z ∈ C : |z| = |z_0|}. 
So [z_0] is the set of all complex numbers with the same modulus 
as z_0. 

If |z_0| = r, say, then 
[z_0] = {z ∈ C : |z| = r}. 
This set forms the circle with centre the origin and radius r in 
the complex plane. 

Hence the equivalence classes of ~ are the circles in the complex 
plane with centre the origin. (The origin is an equivalence class 
containing just the complex number 0; it can be thought of as a 
circle of radius 0.) 


Some of the equivalence classes of the equivalence relation in Worked 
Exercise A76 are illustrated in Figure 9. They are the circles with centre 
the origin, together with the origin itself. Notice that, as expected, the 
equivalence classes partition the set on which the equivalence relation is 
defined. 
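The grouping of complex numbers by modulus can be seen on a small sample. The sketch below restricts attention to complex numbers a + bi with integer a and b, and uses the squared modulus a² + b² as the label of each class so that the comparison is exact; both choices, and the name `classes_by_modulus`, are assumptions made for this illustration.

```python
from collections import defaultdict


def classes_by_modulus(points):
    """Group complex numbers a + bi, given as integer pairs (a, b), by modulus.

    The key of each class is the squared modulus a*a + b*b; two numbers have
    the same modulus exactly when these keys agree.
    """
    classes = defaultdict(list)
    for a, b in points:
        classes[a * a + b * b].append(complex(a, b))
    return dict(classes)


grid = [(a, b) for a in range(-2, 3) for b in range(-2, 3)]
for r_squared, members in sorted(classes_by_modulus(grid).items()):
    print(r_squared, members)
```

Each printed line lists the sample points on one circle centred at the origin; the line with key 0 contains only the origin.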


Figure 9 Some equivalence classes of the equivalence relation in Worked 
Exercise A76 

Exercise A133 

Determine all the equivalence classes of the equivalence relation ~ defined 
on Z by 

m ~ n if m − n is even. 

(You saw that this relation is an equivalence relation in Exercise A130(a), 
and you were asked to find the equivalence class [1] of this relation in 
Exercise A132.) 


Exercise A134 


Let ~ be the relation defined on R by 
x ~ y if ⌊x⌋ = ⌊y⌋. 


(Remember that ⌊x⌋ denotes the integer part of x: the largest integer that 
is less than or equal to x; for example ⌊4.72⌋ = 4.) 


(a) Show that ~ is an equivalence relation. 
(b) Determine the equivalence classes [1] and [—4]. 


(c) Describe all the equivalence classes of ~. 


As a further exercise on equivalence classes, you are asked next to prove 
the converse of Theorem A16, namely that every partition of a set X gives 
rise to an equivalence relation on X whose equivalence classes are the 
subsets that make up the partition. 


Exercise A135 


Let X be a set, and suppose we are given a collection of non-empty subsets 
of X that forms a partition of X. Let ~ be the relation defined on X by 


x ~ y if x and y belong to the same subset in the partition. 


Show that ~ is an equivalence relation on X. 




Representatives of equivalence classes 


You have seen that if ~ is an equivalence relation on a set X, and x and y 
are two elements of X such that x ~ y, then [x] = [y]. Thus, in general, 
there is more than one way to denote each equivalence class using the 
notation [ ]: a class can be denoted by [x] where x is any one of its 
elements. For example, consider again the equivalence classes of the 
equivalence relation congruence modulo 5: 

[0] = {..., −10, −5, 0, 5, 10, 15, ...}, 
[1] = {..., −9, −4, 1, 6, 11, 16, ...}, 
[2] = {..., −8, −3, 2, 7, 12, 17, ...}, 
[3] = {..., −7, −2, 3, 8, 13, 18, ...}, 
[4] = {..., −6, −1, 4, 9, 14, 19, ...}. 

We can denote the first equivalence class here by [0], or by [5], or by 
[−5], and so on. Similarly, we can denote the second equivalence class by 
[1], or by [6], or by [−4], and so on; and similarly for the other 
equivalence classes. 


When we are working with an equivalence relation, it is sometimes useful 
to choose a particular element x in each equivalence class and normally 
denote the class by [x]. The element x that we choose is called a 
representative of the class. 


For example, for the equivalence relation congruence modulo 5, whose 
equivalence classes are listed above, the most convenient representatives 
for the five classes are 0, 1, 2, 3 and 4. 


In general, if ~ is an equivalence relation on a set X, then a set of elements 
of X that contains exactly one element from each equivalence class of ~ is 
called a set of representatives for the equivalence relation ~. For 
example, {0, 1, 2, 3, 4} is a set of representatives for congruence modulo 5. 

More generally, for any integer n ≥ 2, the equivalence relation congruence 
modulo n has n equivalence classes, and the most convenient set of 
representatives for them is {0, 1, 2, ..., n − 1}, as set out below. 

[0] = {..., −2n, −n, 0, n, 2n, ...}, 
[1] = {..., 1 − 2n, 1 − n, 1, 1 + n, 1 + 2n, ...}, 
⋮ 
[n − 1] = {..., −n − 1, −1, n − 1, 2n − 1, 3n − 1, ...}. 


In other words, the most convenient set of representatives for the 
equivalence relation congruence modulo n is the 
set Z_n = {0, 1, 2, ..., n − 1}, which you worked with in Subsection 3.3 of 
Unit A2. The definitions of the modular operations +_n and ×_n can be 
rephrased in terms of the equivalence classes of the equivalence relation 
congruence modulo n, as follows. 

For all a, b ∈ Z_n, 
a +_n b is the integer in Z_n that lies in the class [a + b], 
a ×_n b is the integer in Z_n that lies in the class [a × b]. 


For example, in Z_5, 
3 +_5 4 = 2, 

because 3 + 4 = 7 and the equivalence class [7] of congruence modulo 5 
contains the element 2 of Z_5. 
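The rephrased definitions translate directly into code: compute a + b or a × b in Z, then pick out the representative of its class that lies in Z_n. A minimal sketch follows; the function names `add_mod` and `mul_mod` are hypothetical, and the representative is found with Python's remainder operator.

```python
def add_mod(a, b, n):
    """Return a +_n b: the element of Z_n = {0, ..., n-1} in the class [a + b]."""
    return (a + b) % n


def mul_mod(a, b, n):
    """Return a x_n b: the element of Z_n in the class [a x b]."""
    return (a * b) % n


print(add_mod(3, 4, 5))  # the example above: the element of Z_5 in the class [7]
```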


Exercise A136 


Use the definitions of +_n and ×_n above to calculate 4 +_5 4 and 3 ×_5 4, 
writing out the details of your working. 


As another example of using representatives for equivalence classes, 
consider again the equivalence relation ~ defined on C by 


z_1 ~ z_2 if |z_1| = |z_2|. 


The equivalence classes of this equivalence relation were found in Worked 
Exercise A76 to be all the sets of the form 


{z ∈ C : |z| = r}, 

where r ∈ R and r ≥ 0. That is, they are the circles in the complex plane with centre 
the origin, including the origin itself as a ‘circle of radius 0’. 


Consider the particular equivalence class 

{z ∈ C : |z| = 4}, 

that is, the circle of radius 4, which is shown in Figure 10(a). This class 
contains the complex numbers 4, −4i and −2√2(1 + i), for example, since 
all these complex numbers have modulus 4. So we could denote this 
equivalence class by any of [4], [−4i] or [−2√2(1 + i)], for example. We 
might decide that it is convenient to choose the representative 4, and 
denote the class by [4]. In general, the equivalence class 

{z ∈ C : |z| = r} 

of this equivalence relation contains the element r and so can be denoted 
by [r]. Some examples of this choice of representatives are shown in 
Figure 10(b). 

Figure 10 For the equivalence relation given by z_1 ~ z_2 if |z_1| = |z_2|: 
(a) a particular equivalence class (b) some equivalence classes with 
representatives 


A set of complex numbers that contains exactly one element from each 
equivalence class of the equivalence relation ~ is the set [0, ∞), the set of 
all non-negative real numbers. So [0, ∞) is a set of representatives for ~. 


Exercise A137 


Describe a set of representatives for each of the following equivalence 
relations. 


(a) The relation ~ defined on Z by 
m ~ n if m − n is even. 


(You saw that ~ is an equivalence relation in Exercise A130(a), and 
you were asked to find its equivalence classes in Exercise A133.) 


(b) The relation ~ defined on R by 
x ~ y if ⌊x⌋ = ⌊y⌋. 


(You were asked to show that ~ is an equivalence relation, and find 
its equivalence classes, in Exercise A134.) 


Congruence modulo 2π 


To end this subsection we look at an equivalence relation that is similar to 
congruence modulo n on Z, but which is defined on R rather than Z, and 
in which the modulus is 2π, rather than an integer n. You will see that 
this equivalence relation enables us to express concisely some results about 
complex numbers. 


This relation is the relation ~ defined on R by 
x ~ y if x − y = 2πk for some integer k. 


We begin by showing that this relation actually is an equivalence relation. 

E1 Let x ∈ R. Then x − x = 0 = 2π × 0, so x ~ x. Thus ~ is reflexive. 

E2 Let x, y ∈ R and suppose that x ~ y. Then 
x − y = 2πk 
for some integer k. Hence 
y − x = 2π(−k). 
Since −k is an integer, this shows that y ~ x. Thus ~ is symmetric. 

E3 Let x, y, z ∈ R and suppose that x ~ y and y ~ z. Then 
x − y = 2πj and y − z = 2πk 
for some integers j and k. Hence 
x − z = (x − y) + (y − z) = 2πj + 2πk = 2π(j + k). 
Since j + k is an integer, this shows that x ~ z. Thus ~ is transitive. 

Since ~ is reflexive, symmetric and transitive, it is an equivalence relation. 


The equivalence relation ~ above is known as congruence modulo 2π. 
For this equivalence relation, we can use notation similar to the notation 
that we use for congruence modulo n. That is, rather than writing 

x ~ y, 

we can write 

x ≡ y (mod 2π). 

For example, 

9π/2 ≡ π/2 (mod 2π), 

because 

9π/2 − π/2 = 2 × 2π. 


You have seen that congruence modulo n on Z corresponds to modular 
arithmetic on the set Z_n, which is a set of representatives of the 
equivalence classes of congruence modulo n. In a similar way, congruence 
modulo 2π on R corresponds to modular arithmetic on a set of 
representatives of the equivalence classes of congruence modulo 2π. 

The equivalence classes of ~ are the sets of the form 

[x] = {x + 2πn : n ∈ Z}, 

where x ∈ R. For example, one equivalence class is 

[0] = {..., −4π, −2π, 0, 2π, 4π, ...}, 

and another is 

[π/2] = {..., −7π/2, −3π/2, π/2, 5π/2, 9π/2, ...}. 

A suitable set of representatives for the equivalence classes is the interval 
(−π, π], since every equivalence class has exactly one representative in this 
interval. Other intervals can be used, for example [0, 2π), but (−π, π] is 
useful as it corresponds to our definition of the principal argument of a 
complex number. 


We define modular operations +_{2π} and ×_{2π} on the interval (−π, π] as 
follows. For all x, y ∈ (−π, π], 

x +_{2π} y is the real number in (−π, π] that lies in the class [x + y], 
x ×_{2π} y is the real number in (−π, π] that lies in the class [xy]. 

For example, 

π +_{2π} π/2 = −π/2, 

since π + π/2 = 3π/2 and [3π/2] contains the element −π/2 of (−π, π]. This 
type of modular arithmetic is effectively what we do when we find the 
principal argument of a complex number arising from some calculation. 
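Finding the representative of a class in (−π, π] amounts to subtracting a suitable integer multiple of 2π. The sketch below is one way to do this in floating-point arithmetic, so its results are approximations; the function names `reduce_mod_2pi` and `add_mod_2pi` are hypothetical, and the use of a ceiling to choose the multiple is a design choice that keeps the endpoint π and excludes −π.

```python
import math


def reduce_mod_2pi(x):
    """Return the representative of the class [x] that lies in (-pi, pi]."""
    # k is the least integer with x - 2*pi*k <= pi, so the result is also > -pi.
    k = math.ceil((x - math.pi) / (2 * math.pi))
    return x - 2 * math.pi * k


def add_mod_2pi(x, y):
    """Return the representative in (-pi, pi] of the class [x + y]."""
    return reduce_mod_2pi(x + y)


print(add_mod_2pi(math.pi, math.pi / 2))  # approximately -pi/2
```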


Recall that the principal argument of a complex number z is denoted by 
Arg z. Arithmetic modulo 2π on the interval (−π, π] gives us a concise way 
to express some results about complex numbers that involve principal 
arguments. For example, you saw in Unit A2 that, if z_1 and z_2 are any two 
complex numbers, then Arg z_1 + Arg z_2 is an argument of z_1 z_2, but is not 
necessarily the principal argument. The principal argument is 
Arg z_1 +_{2π} Arg z_2, so we can now state that 

Arg(z_1 z_2) = Arg z_1 +_{2π} Arg z_2. 

You have now seen what congruence modulo 2π means. For any real number 
r, we can define congruence modulo r on R in a similar way to the 
way that congruence modulo 2π is defined, and it can be checked that this 
relation is an equivalence relation in a similar way to the argument above. 
In fact, the equivalence relation in Worked Exercise A74(b) is congruence 
modulo 1 on R. However, congruence modulo 2π is particularly useful, for 
the reasons you saw above. 
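The statement about principal arguments can be tried numerically on a particular pair of complex numbers. This is a check of one instance only, not a proof. The sketch assumes that Python's `cmath.phase` agrees with Arg for the chosen values (it returns a value in [−π, π], which can differ from the convention here only for numbers on the negative real axis whose imaginary part is a negative zero); the helper `principal_sum` and the sample values are hypothetical.

```python
import cmath
import math


def principal_sum(x, y):
    """Return the representative in (-pi, pi] of the class [x + y]."""
    s = x + y
    return s - 2 * math.pi * math.ceil((s - math.pi) / (2 * math.pi))


z1 = complex(-1, 1)  # principal argument 3*pi/4
z2 = complex(0, 1)   # principal argument pi/2
lhs = cmath.phase(z1 * z2)
rhs = principal_sum(cmath.phase(z1), cmath.phase(z2))
print(math.isclose(lhs, rhs))
```

Here the ordinary sum of the two principal arguments is 5π/4, which lies outside (−π, π], so the reduction modulo 2π is genuinely needed.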


Summary 


In this unit you have been working with the bricks and mortar from which 
mathematics is built — the statements that express mathematical ideas and 
the proofs that establish which statements are true. 


You have met different types of mathematical statements and seen how 
they can be combined and negated to make new statements. You have 
encountered several different methods of proof — some direct, such as the 
Principle of Mathematical Induction, and others indirect, such as proof by 
contradiction and proof by contraposition. You have also practised writing 
your own proofs and learned how to critically analyse mathematical 
arguments. 


Skills such as these are not acquired easily, so do not be discouraged if you 
found some parts of this unit rather hard. There will be many more 
opportunities to read and write proofs as you work through the remaining 
units in this module, so your skills will develop as you continue your 
studies. 


Finally, you have been introduced to the important topic of an equivalence 
relation on a set — a precise way of defining which elements of a set we 
regard as equivalent or ‘the same’. You will make extensive use of 
equivalence relations in the group theory units of this module. 


Learning outcomes 


After working through this unit, you should be able to: 


• understand what is asserted by various types of mathematical 
statements, in particular implications and equivalences 

• negate a mathematical statement, including universal and existential 
statements 

• produce simple proofs of various types, including direct proofs, proofs by 
induction, by contradiction and by contraposition 

• disprove a universal statement by providing a counterexample 

• read and understand the logical structure of more complex proofs 

• critically analyse a mathematical argument to identify, explain and 
rectify mathematical errors 

• explain the meanings of a relation defined on a set, an equivalence 
relation and a partition of a set 

• determine whether a relation defined on a set is an equivalence relation 
by checking the reflexive, symmetric and transitive properties 

• understand that an equivalence relation partitions a set into equivalence 
classes, and determine the equivalence classes for an equivalence relation. 




Solutions to exercises 


Solution to Exercise A101 


(a) This is a mathematical statement. Whether 
the statement is true or false depends on the value 
of the variable n, so the statement is a variable 
proposition. 


(b) This assertion is not a mathematical 
statement, as the property of ‘being small’ has not 
been defined mathematically, and so it is not 
precise enough. 

(c) Since {1,2,3,4} is not an integer, it cannot be 
even or odd. Therefore this assertion is neither 
true nor false, and so it is not a mathematical 
statement. 


(d) This is a mathematical statement (a false 
one). It contains no variable, and so is not a 
variable proposition. 


Solution to Exercise A102 


(a) The negation can be expressed as 


p= 3 is not a solution of the equation 
32 +5=0. 


(b) The negation can be expressed as 


The equation n² + n − 2 = 0 does not have 
exactly two solutions 


or, more precisely, as 


The equation n² + n − 2 = 0 has either no 
solution, exactly one solution or more than 
two solutions. 


Solution to Exercise A103 


(a) The negation is ‘it is not the case that both x 
and y are integers’; that is, ‘at least one of x or y is 
not an integer’. Some equivalent formulations of 
this negation are 


either x or y is not an integer, 


or 


x ∉ Z or y ∉ Z. 


(b) The statement is equivalent to the conjunction 
‘m is even and n is odd’. The negation can be 
expressed as 


m is odd or n is even. 


(c) The statement is equivalent to the disjunction 
‘m is odd or n is odd’. The negation can be 
expressed as 


the integers m and n are both even. 


(d) The negation can be expressed as 


A ≠ Ø and B ≠ Ø. 


Solution to Exercise A104 
(a) The statement can be rewritten as 
if x² − 2x + 1 = 0, then (x − 1)² = 0. 
This is true. 
(b) The statement can be rewritten as 
if n is odd, then n² is odd. 
This is true. 
(c) The statement can be rewritten as 


if a given integer is divisible by 3, then it is 
also divisible by 6. 


This is false. 

(d) The statement can be rewritten as 
if x > 2, then x > 4. 

This is false. 

(e) The statement can be rewritten as 
if x < 0, then x <0. 


This is true. 


Solution to Exercise A105 
(a) The negation is 
m and n are odd, and m +n is not odd, 


that is, 


m and n are odd, and m + n is even. 


(b) The negation of ‘A ∪ B = Ø or B − A = Ø’ is 
A ∪ B ≠ Ø and B − A ≠ Ø. 
Thus the negation of the implication is 


A = Ø, and A ∪ B ≠ Ø and B − A ≠ Ø. 


Solution to Exercise A106 


(a) The converse is 
if m +n is even, then m and n are both odd. 


The given implication is true, and its converse is 
false. 


(b) The converse is 


if m +n is odd, then one of the pair m,n is 
even and the other is odd. 


The given implication and its converse are both 
true. 


Solution to Exercise A107 


(a) The converse is 


if at least one of m or n is even, then mn is 
even. 


The contrapositive is 
if both m and n are odd, then mn is odd. 


The converse is true, and so is the contrapositive. 
(Since the contrapositive is true, the original 
statement is also true). 
(b) The converse is 
if q divides m or q divides n, then q divides 
the product mn. 
The contrapositive is 
if q divides neither m nor n, then q does not 


divide the product mn. 


The converse is true, but the contrapositive (and 
hence the original statement) is false. 


Solution to Exercise A108 


(a) The two implications are ‘if the product mn is 
odd, then both m and n are odd’, and ‘if both m 
and n are odd, then the product mn is odd’. Both 
implications are true, so the equivalence is true. 




(b) The two implications are ‘if the product mn is 
even, then both m and n are even’, and ‘if both m 
and n are even, then the product mn is even’. The 
first implication is false, and the second is true. As 
at least one implication is false, the equivalence is 
false. 


Solution to Exercise A109 
(a) The negation is 


it is not the case that there is a real number x 
such that cos x = x; 


that is, 

there is no real number x such that cos x = x. 
Another way of expressing this negation is 

for all real numbers x, cos x ≠ x. 
(b) The negation can be expressed as 


there is no integer that is divisible by 3 but 
not by 6, 


or, alternatively, 


every integer that is divisible by 3 is also 
divisible by 6. 


(c) The negation can be expressed as 


there is a real number x that does not satisfy 
the inequality x² ≥ 0, 


or, alternatively, 


there is a real number x such that x² < 0. 


Solution to Exercise A110 


(a) Suppose that n is an even integer. Then 
n = 2k, where k is an integer, so 


n² = (2k)² = 4k² = 2(2k²). 


Since 2k² is an integer, this proves that n² is even, 
as required. 


(b) Let m and n be multiples of k. Then m = ka 
and n = kb, where a and b are integers. Hence 


m+n = ka + kb = k(a + b). 


Since a + b is an integer, we deduce that m +n is a 
multiple of k, as required. 




(c) Suppose that one of the pair m, n is even and 
the other is odd. Then one of them is equal to 2k 
and the other to 2l + 1, for some integers k and l. 
Then 


m + n = 2k + (2l + 1) = 2(k + l) + 1. 


Since k + l is an integer, this shows that m + n is 
odd. 


(d) Let n be a positive integer. We note that 
n² + n = n(n + 1). 


Either n or n + 1 must be even, so their product 
n² + n is even, as required. 


(Alternatively, the implication can be proved by 
considering two separate cases: the case where n is 
even, and the case where n is odd. However, the 
proof above is shorter and simpler.) 


Solution to Exercise A111 
The problem lies in the step 


x + 1 < 0 ⇒ (x + 1)² < 0. 


This implication is false: take, for example, 
x = −2. The writer of the deduction seems to have 
used an incorrect assumption that an inequality is 
preserved by squaring its two sides, that is, that 
for real numbers a and b 


a < b ⇒ a² < b². 


(This implication only holds under the additional 
assumption that a ≥ 0.) 


Solution to Exercise A112 


The problem with this argument is that it starts 
by assuming the statement to be proved 

(|z1| = |z2|) and uses it to deduce a second 
statement that is known to be true (5 = 5). 
Deducing a true statement Q from a statement P 
does not tell us that P is true, so the truth of the 
second statement provides no information on the 
truth of the original statement. 


Below is a correct proof that shows that each side 
of the equality to be proved is equal to the same 
value. 


Since z_1 = 1 + 2i, we have 
|z_1| = √(1² + 2²) 
= √5. 
Since z_2 = √3 − i√2, we have 
|z_2| = √((√3)² + (−√2)²) 
= √(3 + 2) 
= √5. 


Therefore |z_1| = √5 and |z_2| = √5, so |z_1| = |z_2| as 
required. 


(An alternative, but less obvious, proof starts with 
the left-hand side of the equality to be proved and 
shows directly that it is equal to the right-hand 
side. 
|z_1| = √(1² + 2²) 
= √5 
= √(3 + 2) 
= √((√3)² + (−√2)²) 
= |z_2|. 
Therefore |z_1| = |z_2|, as required.) 


Solution to Exercise A113 


(a) Assume that n is even. Then n = 2k for some 
integer k, and so 

n + 8 = 2k + 8 
= 2(k + 4). 

Since k + 4 is an integer, this shows that n + 8 is 
even. So 

n is even ⇒ n + 8 is even. 

Now assume that n + 8 is even. Then n + 8 = 2k 
for some integer k, and so 

n = 2k − 8 
= 2(k − 4). 

Since k − 4 is an integer, this shows that n is even. 
So 

n + 8 is even ⇒ n is even. 


Hence n is even ⇔ n + 8 is even. 


(b) Assume that A ⊆ A ∩ B, and let x be such 
that x ∈ A. Since A ⊆ A ∩ B, it follows that 
x ∈ A ∩ B, so, in particular, x ∈ B. Therefore 
A ⊆ B. So 


A ⊆ A ∩ B ⇒ A ⊆ B. 


Now assume that A ⊆ B, and let x be such that 
x ∈ A. Then, since A ⊆ B, it follows that x ∈ B, 
and so x ∈ A ∩ B. Hence A ⊆ A ∩ B. So 


A ⊆ B ⇒ A ⊆ A ∩ B. 
Hence A ⊆ A ∩ B ⇔ A ⊆ B. 


Solution to Exercise A114 


The initial explanation of how to find a suitable 
integer is not a necessary part of the solution: it is 
included to show a possible way to find the example. 


The condition $3^n > 9^n$ is equivalent to
\[ \frac{3^n}{9^n} = \left(\frac{1}{3}\right)^n > 1. \]
This condition is satisfied by negative values of $n$,
for example $n = -1$.
Let $n = -1$. Then $3^n = \frac{1}{3}$, $9^n = \frac{1}{9}$ and $\frac{1}{3} > \frac{1}{9}$, so
$3^n > 9^n$, as required.


Solution to Exercise A115 


There are many other possible counterexamples in 
each part of this exercise. 


(a) Taking m = 1 and n = 3 provides a 
counterexample since then m +n = 4, which is 
even. 


(b) The number $-3$ is a counterexample because
$-3 < 2$ but $((-3)^2 - 2)^2 = (9 - 2)^2 = 7^2 = 49$,
which is not less than 4.


(c) We look for a counterexample. Here is a table
for the first few values of $n$.

n          1    2    3
4^n + 1    5    17   65

Since $4^3 + 1 = 65$ is not a prime number, $n = 3$ provides
a counterexample, so this implication is false.
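
A search of this kind is easy to automate. The sketch below, in Python, tabulates $4^n + 1$ and stops at the first value that is not prime; the function names `is_prime` and `first_counterexample` and the search limit are illustrative choices, not part of the exercise.

```python
def is_prime(m: int) -> bool:
    """Trial division: m is prime if it is at least 2 and has no divisor d with d*d <= m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def first_counterexample(limit: int = 10):
    """Return the smallest n in 1..limit for which 4**n + 1 is not prime, or None."""
    for n in range(1, limit + 1):
        if not is_prime(4**n + 1):
            return n
    return None

for n in range(1, 4):
    print(n, 4**n + 1, is_prime(4**n + 1))
```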


Solution to Exercise A116 
The implication
\[ x^2 = 9 \implies x = 3 \]
is false, as $-3 \ne 3$ and $(-3)^2 = 9$, so $x = -3$ is a
counterexample. Hence the equivalence is false.




Solution to Exercise A117 

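(a) Let $P(n)$ be the statement
\[ 1 + 2 + \cdots + n = \tfrac{1}{2} n(n + 1). \]

$P(1)$ is true since $1 = \tfrac{1}{2} \times 1 \times (1 + 1)$.

Let $k \ge 1$, and assume that $P(k)$ is true; that is,
\[ 1 + 2 + \cdots + k = \tfrac{1}{2} k(k + 1). \]

We wish to deduce that $P(k + 1)$ is true; that is,
\[ 1 + 2 + \cdots + k + (k + 1) = \tfrac{1}{2} (k + 1)(k + 2). \]

Now
\begin{align*}
1 + 2 + \cdots + k + (k + 1)
&= \tfrac{1}{2} k(k + 1) + (k + 1) \quad \text{(by $P(k)$)} \\
&= (k + 1)\left(\tfrac{1}{2} k + 1\right) \\
&= \tfrac{1}{2} (k + 1)(k + 2).
\end{align*}
Thus, for $k = 1, 2, \ldots$,
\[ P(k) \implies P(k + 1). \]

Hence, by mathematical induction, $P(n)$ is true,
for $n = 1, 2, \ldots$.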


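(b) Let $P(n)$ be the statement
\[ 1^3 + 2^3 + \cdots + n^3 = \tfrac{1}{4} n^2 (n + 1)^2. \]

$P(1)$ is true since
\[ 1^3 = 1 \quad \text{and} \quad \tfrac{1}{4} \times 1^2 \times (1 + 1)^2 = 1. \]

Let $k \ge 1$, and assume that $P(k)$ is true; that is,
\[ 1^3 + 2^3 + \cdots + k^3 = \tfrac{1}{4} k^2 (k + 1)^2. \]

We wish to deduce that $P(k + 1)$ is true; that is,
\[ 1^3 + 2^3 + \cdots + k^3 + (k + 1)^3 = \tfrac{1}{4} (k + 1)^2 (k + 2)^2. \]

Now
\begin{align*}
1^3 + 2^3 + \cdots + k^3 + (k + 1)^3
&= \tfrac{1}{4} k^2 (k + 1)^2 + (k + 1)^3 \quad \text{(by $P(k)$)} \\
&= (k + 1)^2 \left(\tfrac{1}{4} k^2 + (k + 1)\right) \\
&= \tfrac{1}{4} (k + 1)^2 (k^2 + 4k + 4) \\
&= \tfrac{1}{4} (k + 1)^2 (k + 2)^2.
\end{align*}
Thus, for $k = 1, 2, \ldots$,
\[ P(k) \implies P(k + 1). \]

Hence, by mathematical induction, $P(n)$ is true,
for $n = 1, 2, \ldots$.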
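
Both formulas can be spot-checked for small values of $n$. The Python sketch below compares each sum, computed term by term, with its closed form; the function names are illustrative, and a finite check of this kind is only a sanity test, since it is the induction argument that establishes the result for every $n$.

```python
def sum_first(n: int) -> int:
    """1 + 2 + ... + n, computed term by term."""
    return sum(range(1, n + 1))

def sum_cubes(n: int) -> int:
    """1^3 + 2^3 + ... + n^3, computed term by term."""
    return sum(i**3 for i in range(1, n + 1))

def formulas_hold(n: int) -> bool:
    """Compare both sums with n(n+1)/2 and n^2 (n+1)^2 / 4 (exact integer arithmetic)."""
    return (sum_first(n) == n * (n + 1) // 2
            and sum_cubes(n) == (n * (n + 1)) ** 2 // 4)

print(all(formulas_hold(n) for n in range(1, 101)))
```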




Solution to Exercise A118 


(a) Let $P(n)$ be the statement '$4^{2n-3} + 1$ is a
multiple of 5'.

$P(2)$ is true because $4^{2 \times 2 - 3} + 1 = 4^1 + 1 = 5$.

Now let $k \ge 2$, and assume that $P(k)$ is true; that
is,
\[ 4^{2k-3} + 1 \text{ is a multiple of 5}. \]

We wish to deduce that $P(k + 1)$ is true; that is,
\[ 4^{2(k+1)-3} + 1 = 4^{2k-1} + 1 \text{ is a multiple of 5}. \]
Now
\begin{align*}
4^{2k-1} + 1 &= 4^2 \times 4^{2k-3} + 1 \\
&= 16 \times 4^{2k-3} + 1 \\
&= 15 \times 4^{2k-3} + 4^{2k-3} + 1.
\end{align*}
The first term here is a multiple of 5, and $4^{2k-3} + 1$
is a multiple of 5, by $P(k)$. Therefore $4^{2k-1} + 1$ is a
multiple of 5. Hence
\[ P(k) \implies P(k + 1), \quad \text{for } k = 2, 3, \ldots. \]
By mathematical induction, it follows that $P(n)$ is
true, for $n = 2, 3, \ldots$.
(b) Let $P(n)$ be the statement $5^n < n!$.

$P(12)$ is true because $5^{12} = 2.44 \times 10^8$ and
$12! = 4.79 \times 10^8$, both to three significant figures.

Now let $k \ge 12$, and assume that $P(k)$ is true; that
is,
\[ 5^k < k!. \]
We wish to deduce that $P(k + 1)$ is true; that is,
\[ 5^{k+1} < (k + 1)!. \]

Now
\begin{align*}
5^{k+1} &= 5 \times 5^k \\
&< 5 \times k! \quad \text{(by $P(k)$)} \\
&< (k + 1)k! \\
&= (k + 1)!,
\end{align*}
where we have used the fact that $k \ge 12$, so
$k + 1 \ge 13 > 5$. Thus we have shown that
\[ P(k) \implies P(k + 1), \quad \text{for } k = 12, 13, \ldots. \]

Hence, by mathematical induction, $P(n)$ is true,
for $n = 12, 13, \ldots$.
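
The starting values used in the two inductions can be checked with exact integer arithmetic. The sketch below is in Python, with illustrative function names; it tests both statements over a finite range only, so it complements the proofs rather than replacing them. It also confirms that 12 is the first positive integer $n$ for which $5^n < n!$ holds.

```python
import math

def multiple_of_five(n: int) -> bool:
    """Statement of part (a): 4^(2n-3) + 1 is a multiple of 5 (n >= 2)."""
    return (4 ** (2 * n - 3) + 1) % 5 == 0

def factorial_dominates(n: int) -> bool:
    """Statement of part (b): 5^n < n!."""
    return 5**n < math.factorial(n)

print(5**12, math.factorial(12))           # the two numbers compared in P(12)
print(factorial_dominates(11), factorial_dominates(12))
```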




Solution to Exercise A119 

Let $P(m)$ be the statement
\[ \text{if } a \equiv b \pmod{n}, \text{ then } a^m \equiv b^m \pmod{n}. \]

$P(1)$ is the statement
\[ \text{if } a \equiv b \pmod{n}, \text{ then } a \equiv b \pmod{n}, \]
which is certainly true.

Assume that $P(k)$ is true; that is, assume that
\[ \text{if } a \equiv b \pmod{n}, \text{ then } a^k \equiv b^k \pmod{n}. \]

We wish to deduce that $P(k + 1)$ is true; that is,
\[ \text{if } a \equiv b \pmod{n}, \text{ then } a^{k+1} \equiv b^{k+1} \pmod{n}. \]

So suppose $a \equiv b \pmod{n}$. Then, by $P(k)$, we
know that $a^k \equiv b^k \pmod{n}$.

By the multiplication property of congruences, we
have that
\[ a^{k+1} \equiv b^{k+1} \pmod{n}. \]

Hence $P(k) \implies P(k + 1)$, for $k = 1, 2, \ldots$.

Thus, by mathematical induction, $P(m)$ is true, for
$m = 1, 2, \ldots$.


Solution to Exercise A120 


The statement of P(n) and the proof of step 1 are 
correct. 


However, the '=' sign in the argument
\[ 2^k + 1 < 2(2^k + 1) = 2 \times 3^k \quad \text{(by $P(k)$)} \]
is incorrect: $P(k)$ is an inequality, so we can at
best conclude that $2(2^k + 1) \le 2 \times 3^k$.

Moreover, even after replacing '=' by '$\le$', all we
can deduce is that $2^k + 1 < 3^{k+1}$, which is not
$P(k + 1)$.

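A correct proof of step 2 is as follows.

Assume $P(k)$; that is, assume that $2^k + 1 \le 3^k$. We
want to deduce that $P(k + 1)$ is true; that is,
\[ 2^{k+1} + 1 \le 3^{k+1}. \]

Now
\begin{align*}
2^{k+1} + 1 &= 2 \times 2^k + 1 \\
&= 2 \times (2^k + 1) - 1 \\
&\le 2 \times 3^k - 1 \quad \text{(by $P(k)$)} \\
&< 3 \times 3^k - 1 \\
&= 3^{k+1} - 1 \\
&< 3^{k+1}.
\end{align*}

It follows that $2^{k+1} + 1 \le 3^{k+1}$.

Thus $P(k) \implies P(k + 1)$, for $k = 1, 2, \ldots$.

Hence, by mathematical induction, $P(n)$ is true,
for $n = 1, 2, \ldots$.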


Solution to Exercise A121 


Suppose that there exists a rational number $x$ such
that $x^3 = 2$. Since $x$ is rational, we can write
$x = p/q$, where $p$ and $q$ are coprime positive
integers.

Then the equation $x^3 = 2$ becomes
\[ \left(\frac{p}{q}\right)^3 = 2, \]
that is,
\[ p^3 = 2q^3, \]
which tells us that $p^3$ must be even. Now, the cube
of an odd number, say $2k + 1$ for some integer $k$, is
odd because
\begin{align*}
(2k + 1)^3 &= 8k^3 + 12k^2 + 6k + 1 \\
&= 2(4k^3 + 6k^2 + 3k) + 1,
\end{align*}
so $p$ must be even, and hence it can be written as
$2r$ for some integer $r$. Then our equation becomes
\[ (2r)^3 = 2q^3, \]
so we have
\[ q^3 = 4r^3. \]

Hence $q^3$, and therefore $q$, is also even, so 2 is a
common factor of $p$ and $q$. But $p$ and $q$ were
assumed to be coprime, so we have obtained a
contradiction.

Therefore there is no rational number $x$ such that
$x^3 = 2$.

(Alternatively, instead of proving that the cube of
an odd number is odd, you could use the fact that
a positive integer is even if and only if its cube is
even, which was proved in Worked Exercise A57.)


Solution to Exercise A122 


(a) Suppose that there exist real numbers $a$ and $b$
with $ab > \tfrac{1}{2}(a^2 + b^2)$.

Then $a^2 - 2ab + b^2 < 0$; that is, $(a - b)^2 < 0$. Since
a square can never be negative, this is a
contradiction, so our supposition must be false.
Hence there are no such real numbers $a$ and $b$.


(b) Suppose that there exist integers m and n 
with 5m + 15n = 357. 


Since m and n are integers, it follows that the 
left-hand side of this equation, 5m + 15n, is a 
multiple of 5. However, the right-hand side of the 
equation, 357, is not a multiple of 5. This is a 
contradiction, so our supposition must be false. 
Hence there are no such integers m and n. 


Solution to Exercise A123 
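Suppose that $n = a + 2b$, where $a$ and $b$ are
positive real numbers. Suppose also that $a < \tfrac{1}{2} n$
and $b < \tfrac{1}{4} n$. Then
\[ n = a + 2b < \tfrac{1}{2} n + 2\left(\tfrac{1}{4} n\right) = n. \]

Thus we have deduced that $n < n$. This
contradiction shows that the supposition that
$a < \tfrac{1}{2} n$ and $b < \tfrac{1}{4} n$ must be false; that is, we must
have $a \ge \tfrac{1}{2} n$ or $b \ge \tfrac{1}{4} n$.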


Solution to Exercise A124 


(a) We prove the contrapositive implication,
which is
\[ n \text{ is even} \implies n^3 + 2n + 1 \text{ is odd}. \]

Suppose that $n$ is even. Then $n = 2k$ for some
integer $k$, and so
\begin{align*}
n^3 + 2n + 1 &= (2k)^3 + 2 \times 2k + 1 \\
&= 8k^3 + 4k + 1 \\
&= 2(4k^3 + 2k) + 1.
\end{align*}
Since $4k^3 + 2k$ is an integer, $n^3 + 2n + 1$ is odd.

Since the contrapositive is true, the original
implication is also true.

(Alternatively, you may have based your proof on
the fact, proved in Worked Exercise A57, that a
positive integer is even if and only if its cube is
even.)


(b) We prove the contrapositive implication, 
which is 
if at least one of m and n is even, then mn is 
even. 


Suppose that at least one of m and n is even; we 
can take it to be m (since otherwise we can just 
interchange m and n). Then m = 2k for some 
integer k. Hence mn = 2kn, which is even. 


(c) Let $n$ be an integer that is greater than 1. We
prove the contrapositive implication, which is

if $n$ is not a prime number, then $n$ is divisible
by at least one of the primes less than or
equal to $\sqrt{n}$.

Suppose that $n$ is not a prime number. Then
$n = ab$ for some integers $a$, $b$, where $1 < a, b < n$.
By the result of Worked Exercise A68, at least one
of $a$ and $b$ is less than or equal to $\sqrt{n}$. This
number has a prime factor, which must also be less
than or equal to $\sqrt{n}$, and this prime factor must
also be a factor of $n$. This proves the required
contrapositive implication.
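
The result in part (c) is the basis of a simple primality test: to decide whether $n > 1$ is prime, it is enough to try dividing by the primes up to $\sqrt{n}$. The following Python sketch illustrates this; the function names and the use of a sieve to list the small primes are illustrative choices, not part of the exercise.

```python
import math

def primes_up_to(m: int) -> list:
    """Sieve of Eratosthenes: list the primes p with p <= m."""
    if m < 2:
        return []
    sieve = [True] * (m + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, math.isqrt(m) + 1):
        if sieve[i]:
            for j in range(i * i, m + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]

def is_prime(n: int) -> bool:
    """n > 1 is prime if no prime p <= sqrt(n) divides it."""
    if n < 2:
        return False
    return all(n % p != 0 for p in primes_up_to(math.isqrt(n)))

print([n for n in range(2, 30) if is_prime(n)])
```

For example, to test 91 it suffices to try the primes 2, 3, 5 and 7, since $\sqrt{91} < 10$; the test finds the factor 7.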


Solution to Exercise A125 


We prove the contrapositive implication, which is
\[ \text{if } A - B \ne \varnothing, \text{ then } A \nsubseteq B. \]

Suppose that $A - B \ne \varnothing$. Then there is an
element $x$ such that $x \in A$ but $x \notin B$. It follows
that $A \nsubseteq B$, as required.


Solution to Exercise A126 


The proof is incorrect because it has used the 
converse of the statement to be proved, rather than 
its contrapositive. The contrapositive is 


\[ \text{if } n \text{ is even, then } n^3 + 3 \text{ is odd}. \]

An implication and its converse are not equivalent,
therefore the given argument is not a proof of the
original statement. Instead, it is a correct proof by
contraposition of the implication
\[ \text{if } n^3 + 3 \text{ is odd, then } n \text{ is even}. \]

A correct proof of the contrapositive of the original
statement is as follows. Suppose $n$ is even. Then
$n = 2k$ for some integer $k$, and therefore
\begin{align*}
n^3 + 3 &= (2k)^3 + 3 \\
&= 8k^3 + 3 \\
&= 2(4k^3 + 1) + 1.
\end{align*}
Since $4k^3 + 1$ is an integer, this shows that $n^3 + 3$
is odd, as required.


Solution to Exercise A127 


Statement (b) is false and the other three 
statements are true. 


Solution to Exercise A128 


(a) (i) The statement $1.3 \sim 5.3$ is true because
$1.3 - 5.3 = -4$ is an integer.

(ii) The statement $2.8 \sim 2.1$ is false because
$2.8 - 2.1 = 0.7$ is not an integer.

(iii) The statement $2.4 \sim -5.4$ is false because
$2.4 - (-5.4) = 2.4 + 5.4 = 7.8$ is not an integer.

(b) (i) A real number $y$ such that $0.8 \sim y$ is 1.8,
for example, since $0.8 - 1.8 = -1$ is an integer.

(ii) A real number $z$ such that $0.8 \nsim z$ is 0, for
example, since $0.8 - 0 = 0.8$ is not an integer.


(There are many other possible solutions to 
part (b).) 
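
The relation in this exercise ($x \sim y$ when $x - y$ is an integer) can be tested mechanically. The Python sketch below uses exact rational arithmetic, because binary floating-point cannot represent decimals such as 1.3 exactly; the function name `related` and the choice to pass the numbers as decimal strings are assumptions made for the illustration.

```python
from fractions import Fraction

def related(x: str, y: str) -> bool:
    """x ~ y exactly when x - y is an integer; x and y are decimal strings."""
    difference = Fraction(x) - Fraction(y)
    return difference.denominator == 1

print(related("1.3", "5.3"))    # part (a)(i)
print(related("2.8", "2.1"))    # part (a)(ii)
print(related("2.4", "-5.4"))   # part (a)(iii)
```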


Solution to Exercise A129 


(a) E1 The relation ‘has sat next to’ is not 
reflexive, since no one has sat next to themself. 
E2 However, it is symmetric, because if person A 
has sat next to person B, then it follows that 
person B has sat next to person A. 

(Here we have assumed that when we say ‘A has 
sat next to B’ we mean that A and B have both 
been sitting next to each other: we do not allow 
the possibility that only A sat while B stood, for 
example.) 

E3 Finally, it is not transitive, because if 
person A has sat next to person B, and person B 
has sat next to person C, then it does not follow 
that person A has sat next to person C. 


Hence this relation is not an equivalence relation. 


(b) E1 The relation ‘was born in the same year 
as’ is reflexive, because each person was born in 
the same year as themself. 

E2 It is also symmetric, because if person A was 
born in the same year as person B, then it follows 
that person B was born in the same year as 
person A. 

E3 Finally, it is transitive, because if person A 
was born in the same year as person B, and 
person B was born in the same year as person C, 
then person A was born in the same year as 
person C. 


Hence this relation is an equivalence relation. 


Solution to Exercise A130 


(a) E1 Let $n \in \mathbb{Z}$. Then $n - n = 0$, which is even,
so $n \sim n$. Thus $\sim$ is reflexive.

E2 Let $m, n \in \mathbb{Z}$ and suppose that $m \sim n$. Then
$m - n$ is even. Since $n - m = -(m - n)$, it follows
that $n - m$ is also even. Hence $n \sim m$. Thus $\sim$ is
symmetric.

E3 Let $l, m, n \in \mathbb{Z}$ and suppose that $l \sim m$ and
$m \sim n$. Then $l - m$ is even and $m - n$ is even.
Since the sum of two even numbers is also even, it
follows that
\[ (l - m) + (m - n) = l - n \]
is also even. Hence $l \sim n$. Thus $\sim$ is transitive.


Since ~ is reflexive, symmetric and transitive, it is 
an equivalence relation. 


(b) E1 The relation $\sim$ is not reflexive: for
example, we have $2 \nsim 2$, since $2 - 2 = 0$, which is
not odd.

E2 Let $m, n \in \mathbb{Z}$ and suppose that $m \sim n$. Then
$m - n$ is odd. Since $n - m = -(m - n)$, it follows
that $n - m$ is also odd. Hence $n \sim m$. Thus $\sim$ is
symmetric.

E3 The relation $\sim$ is not transitive: for example,
$3 \sim 2$ since $3 - 2$ is odd, and $2 \sim 1$ since $2 - 1$ is
odd, but $3 \nsim 1$ since $3 - 1$ is even.


Since ~ is not reflexive (or transitive), it is not an 
equivalence relation. 


(c) E1 Let $n \in \mathbb{Z}$. Then $n^2 + n^2 = 2n^2$, which is
even since $n^2$ is an integer, so $n \sim n$. Thus $\sim$ is
reflexive.

E2 Let $m, n \in \mathbb{Z}$ and suppose that $m \sim n$. Then
$m^2 + n^2$ is even, and so $n^2 + m^2$ is also even.
Hence $n \sim m$. Thus $\sim$ is symmetric.

E3 Let $l, m, n \in \mathbb{Z}$ and suppose that $l \sim m$ and
$m \sim n$. Then $l^2 + m^2$ is even and $m^2 + n^2$ is even.
Hence
\[ l^2 + m^2 = 2j \quad \text{and} \quad m^2 + n^2 = 2k, \]
where $j, k \in \mathbb{Z}$. Hence
\[ l^2 = 2j - m^2 \quad \text{and} \quad n^2 = 2k - m^2, \]
so
\begin{align*}
l^2 + n^2 &= 2j - m^2 + 2k - m^2 \\
&= 2(j + k - m^2),
\end{align*}
which is even, since $j + k - m^2$ is an integer. Hence
$l \sim n$. Thus $\sim$ is transitive.


Since ~ is reflexive, symmetric and transitive, it is 
an equivalence relation. 


(d) E1 Let $z \in \mathbb{C}$. Then $|z| = |z|$, so $z \sim z$.
Thus $\sim$ is reflexive.

E2 Let $z_1, z_2 \in \mathbb{C}$ and suppose that $z_1 \sim z_2$. Then
$|z_1| = |z_2|$, and so $|z_2| = |z_1|$. Hence $z_2 \sim z_1$.
Thus $\sim$ is symmetric.

E3 Let $z_1, z_2, z_3 \in \mathbb{C}$ and suppose that $z_1 \sim z_2$
and $z_2 \sim z_3$. Then $|z_1| = |z_2|$ and $|z_2| = |z_3|$. Hence
$|z_1| = |z_3|$, that is, $z_1 \sim z_3$. Thus $\sim$ is transitive.


Since ~ is reflexive, symmetric and transitive, it is 
an equivalence relation. 


(e) E1 Let $l$ be a line in the plane. Then $l$ is
parallel to itself, so $l \sim l$. Thus $\sim$ is reflexive.

E2 Let $l_1$ and $l_2$ be lines in the plane and suppose
that $l_1 \sim l_2$. Then $l_1$ is parallel to $l_2$, so $l_2$ is
parallel to $l_1$. That is, $l_2 \sim l_1$. Thus $\sim$ is
symmetric.

E3 Let $l_1$, $l_2$ and $l_3$ be lines in the plane and
suppose that $l_1 \sim l_2$ and $l_2 \sim l_3$. Then $l_1$ is parallel
to $l_2$ and $l_2$ is parallel to $l_3$. It follows that $l_1$ is
parallel to $l_3$, that is, $l_1 \sim l_3$. Thus $\sim$ is transitive.


Since ~ is reflexive, symmetric and transitive, it is 
an equivalence relation. 




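(f) E1 Let $x \in \mathbb{R}$. Then $\lfloor x - x \rfloor = \lfloor 0 \rfloor = 0$, so
$x \sim x$. Thus $\sim$ is reflexive.

E2 The relation $\sim$ is not symmetric: for example,
$0.5 \sim 0$, since
\[ \lfloor 0.5 - 0 \rfloor = \lfloor 0.5 \rfloor = 0, \]
but $0 \nsim 0.5$, since
\[ \lfloor 0 - 0.5 \rfloor = \lfloor -0.5 \rfloor = -1 \ne 0. \]

E3 The relation $\sim$ is not transitive: for example,
$1 \sim 0.5$, since
\[ \lfloor 1 - 0.5 \rfloor = \lfloor 0.5 \rfloor = 0 \]
and $0.5 \sim 0$, since
\[ \lfloor 0.5 - 0 \rfloor = \lfloor 0.5 \rfloor = 0, \]
but $1 \nsim 0$, since
\[ \lfloor 1 - 0 \rfloor = \lfloor 1 \rfloor = 1 \ne 0. \]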


Since ~ is not symmetric (or transitive), it is not 
an equivalence relation. 
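
The counterexamples in part (f) can be reproduced in a few lines. The sketch below, in Python, assumes the relation is $x \sim y$ when the integer part of $x - y$ is 0, as used in the solution; the function name `related` is illustrative, and exact fractions are used so that the arithmetic is not affected by rounding.

```python
import math
from fractions import Fraction

def related(x, y) -> bool:
    """x ~ y exactly when the integer part (floor) of x - y is 0."""
    return math.floor(x - y) == 0

half = Fraction(1, 2)

# Symmetry fails: 0.5 ~ 0 but 0 is not related to 0.5.
print(related(half, 0), related(0, half))

# Transitivity fails: 1 ~ 0.5 and 0.5 ~ 0, but 1 is not related to 0.
print(related(1, half), related(half, 0), related(1, 0))
```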


Solution to Exercise A131 


(a) We start by proving properties E2 (symmetry) 
and E3 (transitivity) for this relation ~, and then 
show that property E1 (reflexivity) does not hold. 


E2 Let $x, y \in \mathbb{R}$ and suppose that $x \sim y$. Then
$xy > 0$, from which it follows that $yx > 0$. Hence
$y \sim x$. Thus $\sim$ is symmetric.

E3 Let $x, y, z \in \mathbb{R}$ and suppose that $x \sim y$ and
$y \sim z$. Then $xy > 0$ and $yz > 0$. By the first of
these inequalities, $x$ and $y$ are either both positive
or both negative, and by the second of the
inequalities, $y$ and $z$ are either both positive or
both negative. It follows that $x$, $y$ and $z$ are either
all positive or all negative. Hence $xz > 0$. Thus $\sim$
is transitive.

E1 The relation $\sim$ is not reflexive; for example,
$0 \nsim 0$, because $0 \times 0 = 0$, which is not greater
than 0.


(b) The error in the proof is the statement ‘Let y 
be an element of X such that x ~ y’. This 
statement makes the assumption that there is such 
an element y, but there may not be. 


The argument in the proof is correct apart from
this step, so it works for each element $x$ that is
related to another element $y$, but it does not work
for an element $x$ that is not related to any other
element in the set $X$. This is why taking $\sim$ to be
the relation defined in part (a), and taking $x = 0$,
provides a counterexample: for this relation, there
is no $y \in \mathbb{R}$ such that $0 \sim y$.


Solution to Exercise A132 
We have 
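\begin{align*}
[1] &= \{n \in \mathbb{Z} : 1 \sim n\} \\
&= \{n \in \mathbb{Z} : 1 - n \text{ is even}\} \\
&= \{n \in \mathbb{Z} : 1 - n = 2k \text{ for some integer } k\} \\
&= \{n \in \mathbb{Z} : n = -2k + 1 \text{ for some integer } k\} \\
&= \{n \in \mathbb{Z} : n = 2k + 1 \text{ for some integer } k\} \\
&= \{n \in \mathbb{Z} : n \text{ is odd}\}.
\end{align*}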

So [1] is the set of odd integers. 


Solution to Exercise A133 


In Exercise A132 we found that [1] is the set of 
odd integers. 


We might suspect that the set of even integers is 
also an equivalence class. To check this, we can 
find the equivalence class [0]. We have 


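\begin{align*}
[0] &= \{n \in \mathbb{Z} : 0 \sim n\} \\
&= \{n \in \mathbb{Z} : 0 - n \text{ is even}\} \\
&= \{n \in \mathbb{Z} : -n = 2k \text{ for some integer } k\} \\
&= \{n \in \mathbb{Z} : n = -2k \text{ for some integer } k\} \\
&= \{n \in \mathbb{Z} : n \text{ is even}\}.
\end{align*}
So, as suspected, [0] is the set of even integers.
Since the set of even integers and the set of odd
integers form a partition of the set $\mathbb{Z}$, they are the
only two equivalence classes of $\sim$.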
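
On a finite set, the way an equivalence relation splits the set into classes can be computed directly. The Python sketch below groups the integers from $-4$ to 4 under the relation '$m - n$ is even'; the function name `equivalence_classes` and the choice of range are illustrative, and the method relies on the relation really being an equivalence relation, since each element is compared with only one representative of each class.

```python
def equivalence_classes(elements, related):
    """Group elements into classes, comparing each with one representative per class."""
    classes = []
    for x in elements:
        for cls in classes:
            if related(cls[0], x):
                cls.append(x)
                break
        else:
            # x is related to no existing representative, so it starts a new class.
            classes.append([x])
    return classes

parity_classes = equivalence_classes(range(-4, 5), lambda m, n: (m - n) % 2 == 0)
print(parity_classes)
```

The two lists produced are the even and the odd integers in the chosen range, matching the two classes [0] and [1] found above.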


Solution to Exercise A134 


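(a) E1 Let $x \in \mathbb{R}$. Then $\lfloor x \rfloor = \lfloor x \rfloor$, so $x \sim x$.
Thus $\sim$ is reflexive.

E2 Let $x, y \in \mathbb{R}$ and suppose that $x \sim y$. Then
$\lfloor x \rfloor = \lfloor y \rfloor$, that is, $\lfloor y \rfloor = \lfloor x \rfloor$. Hence $y \sim x$.
Thus $\sim$ is symmetric.

E3 Let $x, y, z \in \mathbb{R}$ and suppose that $x \sim y$ and
$y \sim z$. Then $\lfloor x \rfloor = \lfloor y \rfloor$ and $\lfloor y \rfloor = \lfloor z \rfloor$. Hence
$\lfloor x \rfloor = \lfloor z \rfloor$, that is, $x \sim z$. Thus $\sim$ is transitive.
Since $\sim$ is reflexive, symmetric and transitive, it is
an equivalence relation.


(b) We have
\begin{align*}
[1] &= \{y \in \mathbb{R} : 1 \sim y\} \\
&= \{y \in \mathbb{R} : \lfloor 1 \rfloor = \lfloor y \rfloor\} \\
&= \{y \in \mathbb{R} : 1 = \lfloor y \rfloor\} \\
&= \{y \in \mathbb{R} : \lfloor y \rfloor = 1\} \\
&= [1, 2).
\end{align*}
That is, the equivalence class [1] is the
interval $[1, 2)$.
Similarly, we have
\begin{align*}
[-4] &= \{y \in \mathbb{R} : -4 \sim y\} \\
&= \{y \in \mathbb{R} : \lfloor -4 \rfloor = \lfloor y \rfloor\} \\
&= \{y \in \mathbb{R} : -4 = \lfloor y \rfloor\} \\
&= \{y \in \mathbb{R} : \lfloor y \rfloor = -4\} \\
&= [-4, -3).
\end{align*}
That is, the equivalence class [-4] is the
interval $[-4, -3)$.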


(c) The equivalence classes of ~ are the intervals 
of the form [n,n + 1) where n is an integer. The 
collection of all such intervals partitions the set R. 


Solution to Exercise A135 


E1 Let $x \in X$. Then $x$ belongs to the same subset
in the partition as itself, so $x \sim x$. Thus $\sim$ is
reflexive.


E2 Let $x, y \in X$ and suppose that $x \sim y$. Then $x$
and $y$ belong to the same subset in the partition,
so $y \sim x$. Thus $\sim$ is symmetric.


E3 Let $x, y, z \in X$ and suppose that $x \sim y$ and
$y \sim z$. Then $x$ and $y$ belong to the same subset in
the partition, and $y$ and $z$ belong to the same
subset in the partition. It follows that $x$, $y$ and $z$
all belong to the same subset in the partition, so
$x \sim z$. Thus $\sim$ is transitive.


Since ~ is reflexive, symmetric and transitive, it is 
an equivalence relation. 


Solution to Exercise A136 


Since $4 + 4 = 8$ and the equivalence class [8] of
congruence modulo 5 contains the element 3 of $\mathbb{Z}_5$,
we have $4 +_5 4 = 3$.


Similarly, since $3 \times 4 = 12$ and the equivalence
class [12] of congruence modulo 5 contains the
element 2 of $\mathbb{Z}_5$, we have $3 \times_5 4 = 2$.
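
Modular addition and multiplication of this kind correspond to taking remainders. In the Python sketch below, the function names `add_mod` and `mul_mod` are illustrative; Python's `%` operator returns the representative of the class in $\{0, 1, \ldots, n - 1\}$ when $n$ is positive.

```python
def add_mod(a: int, b: int, n: int) -> int:
    """Return the element of Z_n in the congruence class of a + b."""
    return (a + b) % n

def mul_mod(a: int, b: int, n: int) -> int:
    """Return the element of Z_n in the congruence class of a * b."""
    return (a * b) % n

print(add_mod(4, 4, 5), mul_mod(3, 4, 5))
```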


Solution to Exercise A137 


(a) The solution to Exercise A133 shows that ~ 
has only two equivalence classes, namely the set of 
all even integers and the set of all odd integers. 


So a suitable set of representatives is the set {0,1}. 


(There are other choices, of course: any set 
containing exactly one even integer and exactly 
one odd integer, such as {22,7} or {4,—1}, is a set 
of representatives, but {0,1} (that is, Z2) is the 
most natural choice.) 


(b) In Exercise A134 it was found that the 
equivalence classes of ~ are the intervals of the 
form [n,n + 1) where n is an integer. 


A suitable set of representatives is Z. 


(There are other choices, such as the set
$\{n + \tfrac{1}{2} : n \in \mathbb{Z}\}$, but $\mathbb{Z}$ is the most natural choice.)




Unit A4 
Real functions, graphs and conics 




Introduction 


In this unit you will look at real functions and their graphs. The graph of a
real function $f$ is the set of points in $\mathbb{R}^2$ with coordinates $(x, f(x))$,
where $x$ is in the domain of $f$.


You will revise some basic real functions and their graphs, and see how 
some of the properties of these functions are featured in their graphs. You 
will learn how to apply similar principles to sketch the graphs of more 
complicated functions, including sums, quotients and composites of other 
functions, and functions that are defined by different rules for different 
values of x. 


Finally, you will revise conics, and see how functions may be used to 
represent curves in the plane even when the curves themselves are not the 
graphs of functions. 


Familiarity with basic calculus is assumed throughout this unit. 


1 Real functions and their graphs 


In this section you will revise a wide variety of real functions, and look at 
their graphs and some of their properties. 


1.1 Real functions 


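In Unit A1 Sets, functions and vectors, you saw that a real function is a
function whose domain and codomain are both subsets of $\mathbb{R}$. For example,
the following are real functions:

\[ f : [0, 2] \longrightarrow \mathbb{R}, \quad x \longmapsto 2x - 5 \qquad \text{and} \qquad g : \mathbb{R} \longrightarrow \mathbb{R}, \quad x \longmapsto x^2. \]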


It is important to remember that a real function, like any function, consists 
of three things: a domain, a codomain and a rule. It does not consist solely 
of a rule. In this unit, we usually refer to a real function simply as a 
‘function’, unless there is a reason to emphasise that it is a real function. 
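
One way to see why the rule alone is not enough is to model a function as a rule packaged with its domain. The Python sketch below is only an illustration: the class name `RealFunction`, its fields, and the second function `h` (the same rule as $f$, but with domain $\mathbb{R}$) are hypothetical and do not come from the module. The functions `f` and `h` share a rule but behave differently, because `f` rejects inputs outside $[0, 2]$.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class RealFunction:
    """A real function: a domain test, a rule, and a codomain label (R by default)."""
    in_domain: Callable[[float], bool]
    rule: Callable[[float], float]
    codomain: str = "R"

    def __call__(self, x: float) -> float:
        # The rule is applied only to elements of the domain.
        if not self.in_domain(x):
            raise ValueError(f"{x} is not in the domain")
        return self.rule(x)

# f has rule x -> 2x - 5 and domain [0, 2].
f = RealFunction(in_domain=lambda x: 0 <= x <= 2, rule=lambda x: 2 * x - 5)
# h (hypothetical) has the same rule but domain R, so it is a different function.
h = RealFunction(in_domain=lambda x: True, rule=lambda x: 2 * x - 5)

print(f(1), h(3))
```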


The notation used above can be a little unwieldy when used frequently, so 
we often simplify it by adopting certain conventions. For example, we 
usually write the function f above as 


\[ f(x) = 2x - 5 \quad (x \in [0, 2]). \]


This type of notation specifies the rule and the domain of a function. It 
does not specify the codomain, but we use the convention that the 
codomain of a real function is R unless otherwise stated. 


If we do not want to give a function a name, such as f, then we can specify 
it by giving its rule and domain in the following manner: 


\[ x \longmapsto 2x - 5 \quad (x \in [0, 2]). \]




As well as omitting the codomain when we specify a real function, we can 
sometimes simplify further by omitting the domain too, and stating just 
the rule. For example, we might specify the function g above by writing 
simply 

\[ g(x) = x^2. \]


When we do this, we use the following convention. 


Convention for real functions 


When a real function is specified only by a rule, it is understood that 
the domain of the function is the set of all real numbers for which the 
rule is applicable, and the codomain of the function is R. 


For example, the function specified only by the rule
\[ f(x) = \frac{1}{\sqrt{4 - x^2}} \]
does not have domain $\mathbb{R}$, because the square root here is a real number
only when $4 - x^2 \ge 0$; that is, when $x^2 \le 4$. This is true when $x$ satisfies
the inequalities $-2 \le x \le 2$ and for no other values of $x$. Furthermore, we
cannot divide by 0, so we must exclude the values $-2$ and 2 from the
domain. Thus the domain of $f$ is the interval $(-2, 2)$.
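
The convention can be mirrored in code by asking, for a given real number, whether the rule can actually be evaluated. The Python sketch below does this for the rule $1/\sqrt{4 - x^2}$; the function names `rule` and `rule_applies` are illustrative. It relies on two behaviours of the standard library: `math.sqrt` raises `ValueError` for a negative argument, and dividing by zero raises `ZeroDivisionError`.

```python
import math

def rule(x: float) -> float:
    """The rule x -> 1 / sqrt(4 - x^2)."""
    return 1 / math.sqrt(4 - x**2)

def rule_applies(x: float) -> bool:
    """True if the rule gives a real number at x, that is, if x is in the domain."""
    try:
        rule(x)
    except (ValueError, ZeroDivisionError):
        # ValueError: square root of a negative number; ZeroDivisionError: division by 0.
        return False
    return True

for x in (-3, -2, 0, 1.9, 2, 3):
    print(x, rule_applies(x))
```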


Exercise A138 


For each of the following rules, determine the domain of the corresponding 
real function f. 


— b > 4r? — 3x? — 62 +4 
(a) z Ig (b) ax $ x z+ 
2 
a — 5x +4 1 
= d m —— 
0 po E 0 ro 


It can sometimes be useful to specify the domain of a real function as a
union of the intervals on which it is defined. For example, the function
$f(x) = 1/x$ has domain $\mathbb{R}^* = \mathbb{R} - \{0\}$, which can be rewritten in terms of a
union of intervals as $(-\infty, 0) \cup (0, \infty)$.


Exercise A139 


For each of the functions in Exercise A138, write down the interval, or 
union of intervals, on which the function is defined. 




1.2 Graphs of basic functions 


In this subsection we briefly review various families of basic real functions 
whose graphs you need to be able to recognise and sketch quickly. You 
should have met most of these functions in your previous studies. 


Remember that a sketch of the graph of a function is not intended to 
achieve the detail possible with a computer plot: instead, it should provide 
a visual summary of the main properties of the function. 


Constant functions 


The simplest family of functions is the family of constant functions, 
that is, functions of the form f(x) = b, where b is a real number. The 
graph of the function f(x) = b is a horizontal line with y-intercept b, as 
illustrated in Figure 1. 




Figure 1 The graph of a constant function 


Linear functions 


The next simplest family is the family of linear functions, that is,
functions of the form $f(x) = ax + b$, where $a, b \in \mathbb{R}$ and $a \ne 0$. The graph
of the linear function $f(x) = ax + b$ is the straight line with gradient $a$ and
y-intercept b, as illustrated in Figure 2(a). In particular, if b = 0, so that 
the function is of the form f(x) = ax, then its graph is the straight line 
through the origin with gradient a, as illustrated in Figure 2(b). 




Figure 2 Graphs of linear functions 


If a is positive, then the line slopes up as x increases; if a is negative, then 
it slopes down. It is straightforward to draw the graph of a linear function: 
you simply plot any two points on the graph and draw a straight line 
through them. 




Quadratic functions 


A quadratic function is a function of the form $f(x) = ax^2 + bx + c$,
where $a, b, c \in \mathbb{R}$ and $a \ne 0$. The simplest quadratic function is the
function $f(x) = x^2$, whose graph is shown in Figure 3(a). The graph of
every quadratic function has a symmetrical 'cup' shape, called a
parabola, which is either the same way up as the graph of $f(x) = x^2$ or
the other way up, as shown in Figure 3(b). This figure also reminds you
what is meant by the vertex and the axis of a parabola.


Figure 3 Parabolas


One way to sketch the graph of a quadratic function $f(x) = ax^2 + bx + c$ is
to first rearrange the expression $ax^2 + bx + c$ into completed-square
form (the next worked exercise reminds you how to do this).


After we have completed the square, a quadratic function has the form
\[ f(x) = a(x - \alpha)^2 + \beta, \]
where $a$ is the same number as in the original expression, and $\alpha$ and $\beta$ are
numbers (which can be positive, negative or zero) that depend on the
values of $a$, $b$ and $c$. The vertex of the parabola is the point $(\alpha, \beta)$, as
illustrated in Figure 4, and the parabola is the same way up as the graph
of $f(x) = x^2$ if $a > 0$, and the opposite way up if $a < 0$. The larger the
magnitude of $a$, the steeper the parabola.


Figure 4 The graph of a quadratic function 
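
The numbers in the completed-square form $a(x - \alpha)^2 + \beta$ can be computed directly from $a$, $b$ and $c$: expanding the completed-square form and comparing coefficients with $ax^2 + bx + c$ gives $\alpha = -b/(2a)$ and $\beta = c - b^2/(4a)$. The Python sketch below applies these formulas using exact fractions; the function name `complete_square` is illustrative and not part of the module.

```python
from fractions import Fraction

def complete_square(a, b, c):
    """Return (alpha, beta) with a*x^2 + b*x + c = a*(x - alpha)^2 + beta, for a != 0."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic function")
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    alpha = -b / (2 * a)
    beta = c - b**2 / (4 * a)
    return alpha, beta

# Vertex of the parabola y = 3x^2 - 2x - 1.
print(complete_square(3, -2, -1))
```

For $f(x) = 3x^2 - 2x - 1$ this gives $\alpha = \frac{1}{3}$ and $\beta = -\frac{4}{3}$, so the vertex is $\left(\frac{1}{3}, -\frac{4}{3}\right)$, and since $a = 3 > 0$ the parabola is the same way up as the graph of $f(x) = x^2$.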




Worked Exercise A77 


By completing the square, find the vertex of the parabola that is the graph
of the function $f(x) = 3x^2 - 2x - 1$.


Exercise A140 


Match each of the following quadratic functions to its graph, by first 
completing the square in each case. 


(a) $f(x) = 2x^2 - 8x + 11$    (b) $f(x) = -2x^2 - 8x - 5$
(c) $f(x) = -2x^2 - 8x - 11$    (d) $f(x) = 2x^2 + 8x + 11$





Cubic functions 


A cubic function is a function of the form $f(x) = ax^3 + bx^2 + cx + d$,
where $a \ne 0$. The simplest cubic function is the function $f(x) = x^3$, whose
graph is shown in Figure 5(a). This graph has rotational symmetry
because the graph is unchanged by rotation through an angle $\pi$ about the
origin; that is, rotating the part of the graph that corresponds to positive
values of $x$ through $\pi$ radians about the origin gives the part of the graph
that corresponds to negative values of $x$.


However, there is more than one basic shape for the graph of a cubic 
function. Figures 5(b)—(d) show three more examples of such graphs, 
which illustrate the following features. 


• The graph of a cubic function $f(x) = ax^3 + bx^2 + cx + d$ crosses the
$x$-axis once or three times, or (more rarely) crosses it once and 'touches'
it once.


• If $a > 0$, then the graph is positive for large positive values of $x$ and
negative for large negative values of $x$.


• If $a < 0$, then the graph is negative for large positive values of $x$ and
positive for large negative values of $x$.




Figure 5 Graphs of cubic functions 


Linear rational functions 


A linear rational function is a function of the form
\[ f(x) = \frac{ax + b}{cx + d}, \]
where $a, b, c, d \in \mathbb{R}$ are such that $c \ne 0$, and $a$ and $b$ are not both 0. Thus
the numerator of a linear rational function is either a constant or a linear
function, and the denominator is a linear function.


The simplest linear rational function is the function f(x) = 1/x, known as 
the reciprocal function, whose graph is shown in Figure 6(a). The graph 
is in two parts, because the function f(x) = 1/x is not defined when x = 0. 
It has two asymptotes, namely the x- and y-axes. Remember that an 
asymptote of a curve is a straight line that the curve approaches 
arbitrarily closely as the domain variable x or the codomain variable y (or 
both) take very large values. 


In general, the graph of a linear rational function $f(x) = (ax + b)/(cx + d)$
has a shape known as a rectangular hyperbola, with a horizontal asymptote
$y = a/c$ and a vertical asymptote $x = -d/c$, as illustrated in Figure 6(b).
The horizontal asymptote $y = a/c$ arises because, for large positive and
negative values of $x$, the value of the function is approximately
$ax/cx = a/c$. The vertical asymptote $x = -d/c$ arises because the function
is undefined when the denominator $cx + d$ is zero.




Figure 6 Graphs of linear rational functions 
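
The two asymptotes of $f(x) = (ax + b)/(cx + d)$ can be read off from the coefficients, as described above: the horizontal asymptote is $y = a/c$ and the vertical asymptote is $x = -d/c$. The Python sketch below computes both with exact fractions; the function name `asymptotes` is illustrative, and integer coefficients are assumed for simplicity.

```python
from fractions import Fraction

def asymptotes(a: int, b: int, c: int, d: int):
    """Return (horizontal, vertical): the lines y = a/c and x = -d/c, for c != 0."""
    if c == 0:
        raise ValueError("c must be non-zero for a linear rational function")
    horizontal = Fraction(a, c)   # limiting value of f(x) for large |x|
    vertical = Fraction(-d, c)    # where the denominator cx + d is zero
    return horizontal, vertical

print(asymptotes(0, 1, 1, 0))    # the reciprocal function f(x) = 1/x
print(asymptotes(2, 1, 1, -3))   # f(x) = (2x + 1)/(x - 3)
```

For the reciprocal function both values are 0, which matches the fact that its asymptotes are the $x$- and $y$-axes.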


Trigonometric functions 


Figure 7 shows the graphs of the trigonometric functions sine and cosine
(usually abbreviated as sin and cos). These two graphs are very similar:
they both look like waves in a horizontal strip between the lines $y = -1$
and $y = 1$. Each of the two graphs is periodic with period $2\pi$, which
means that the shape of the graph repeats every $2\pi$ units on the $x$-axis,
but does not repeat like this at a shorter distance. So, in each graph, the
shape in the shaded region of length $2\pi$ repeats indefinitely in both
directions. The two graphs have exactly the same shape; the cosine graph
is obtained by shifting the sine graph to the left by the distance $\pi/2$. The
graphs have gradient 1 or $-1$ where they cross the $x$-axis.


Figure 7 Graphs of (a) sine and (b) cosine 




Figure 8 shows the graph of the trigonometric function tangent (tan),
which is given by $\tan x = (\sin x)/(\cos x)$. This graph is periodic with
period $\pi$, so the shape of the graph in the shaded region of length $\pi$
repeats indefinitely in both directions. The graph has gradient 1 where it
crosses the $x$-axis, and a larger positive gradient at all other points on the
graph. The function tan is undefined at each odd multiple of $\pi/2$; that is,
at $x = \pm\pi/2, \pm 3\pi/2, \pm 5\pi/2$, and so on. The graph has a vertical
asymptote at each of these values.




Figure 8 Graph of tan 
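
The periodicity properties described for sine, cosine and tangent can be checked numerically at sample points. The Python sketch below is a rough check only: the helper name `repeats_with_period`, the sample points and the tolerance are illustrative choices, and agreement at finitely many points illustrates a period but does not prove it.

```python
import math

def repeats_with_period(func, period, samples, tol=1e-9):
    """True if func(x + period) agrees with func(x) at every sample point."""
    return all(math.isclose(func(x + period), func(x), abs_tol=tol) for x in samples)

# Sample points from -3 to 3 in steps of 0.1 (none is an odd multiple of pi/2).
samples = [0.1 * k for k in range(-30, 31)]

print(repeats_with_period(math.sin, 2 * math.pi, samples))   # sine repeats every 2*pi
print(repeats_with_period(math.sin, math.pi, samples))       # but not every pi
print(repeats_with_period(math.tan, math.pi, samples))       # tangent repeats every pi

# The cosine graph is the sine graph shifted to the left by pi/2.
shifted = all(math.isclose(math.cos(x), math.sin(x + math.pi / 2), abs_tol=1e-9)
              for x in samples)
print(shifted)
```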


There are three more standard trigonometric functions, namely cosecant,
secant and cotangent (cosec, sec and cot), given by
\[ \operatorname{cosec} x = \frac{1}{\sin x}, \quad \sec x = \frac{1}{\cos x} \quad \text{and} \quad \cot x = \frac{\cos x}{\sin x}. \]

You will not often need to sketch the graphs of these functions, but they
are included in Figure 9 for completeness.


Figure 9 Graphs of (a) cosec, (b) sec and (c) cot 




Exponential functions 


An exponential function is a function of the form $f(x) = a^x$, where $a$ is
positive. (In some texts, the trivial case $f(x) = 1^x$, that is, $f(x) = 1$, is not
regarded as an exponential function.)


Figure 10(a) illustrates the shape of the graph of an exponential function
$f(x) = a^x$ when $a > 1$, and Figure 10(b) illustrates the shape when
$0 < a < 1$. An important feature of these graphs is that they lie completely
above the $x$-axis, because $a^x$ is always greater than 0, even when $x$ is
negative. One useful point to plot is the point $(0, 1)$, since the graph of an
exponential function always passes through this point. This is because
$a^0 = 1$ for any positive number $a$.


Figure 10 Graphs of exponential functions 


Figure 11(a) shows the graph of the special exponential function
$f(x) = e^x$, sometimes called the exponential function. The value $e$ is an
irrational number, equal to 2.718 to three decimal places. You will be
reminded why $e$ is important later in the module. Figure 11(b) shows the
graph of the exponential function $f(x) = (1/e)^x$, more usually written as
\[ f(x) = e^{-x}. \]


Figure 11 Graphs of exponential functions f(x) = e^x and f(x) = e^(−x)


The first known use of the letter e to represent the number 2.718... is 
in a manuscript written by Leonhard Euler (1707-1783) in 1727 or 
1728, although not published until 1862. Euler’s first publication to 
contain e was his Mechanica of 1736. It is not known why he chose 
the letter e, but it is likely that it was because it was the next 
available letter in the alphabet, earlier letters already being in 
frequent use in mathematics. 




For the functions we have discussed so far in this subsection, the graph 
takes the form of a smooth curve on each interval in the domain. The next 
two functions are not so ‘well behaved’; their graphs are not smooth curves 
throughout their domains, because they have either ‘corners’ or ‘jumps’. 
You will learn more about functions with properties like these in the 
analysis units of this module. 


Modulus function 


Figure 12 shows the graph of the modulus function f(x) = |x|. As you
saw in Unit 1, the notation |x|, usually read as ‘mod x’, means the
modulus (also called absolute value or magnitude) of the number x,
which is given by

|x| = x, if x ≥ 0,
|x| = −x, if x < 0.

For example, |3| = 3 and |−3| = 3. The graph of f(x) = |x| is the same as
the graph of y = x when x ≥ 0, and the same as the graph of y = −x when
x < 0. It has a corner at the origin.




Figure 12 Graph of the modulus function 


Integer part function 


Figure 13 shows the graph of an even more peculiar function, the integer
part function f(x) = ⌊x⌋. For each x, the integer part ⌊x⌋ of x is
obtained by rounding down to the nearest integer. The rounding is always
down, no matter whether x is positive or negative. For example,

⌊2.8⌋ = 2, ⌊−2.8⌋ = −3 and ⌊2⌋ = 2.

So the graph of f(x) = ⌊x⌋ consists of horizontal line segments with jumps
between them. The left-hand endpoint of each line segment belongs to the
graph, whereas the right-hand endpoint does not. This is indicated on the
graph by the solid dot at each left-hand endpoint and the hollow dot at
each right-hand endpoint.
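If you want to experiment with the integer part function on a computer, the following is a minimal sketch in Python. It assumes that the standard library function math.floor is an acceptable stand-in for the integer part; the names `examples` and `integer_parts` are chosen here purely for illustration. Note that Python's built-in int() rounds towards zero rather than down, so it does not agree with the integer part for negative non-integers.

```python
import math

# math.floor rounds down to the nearest integer, whatever the sign of x.
examples = [2.8, -2.8, 2]
integer_parts = [math.floor(x) for x in examples]
print(integer_parts)  # [2, -3, 2]

# int() truncates towards zero, so it differs from the integer part
# for negative numbers that are not integers.
truncated = int(-2.8)
print(truncated)  # -2
```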




Figure 13 Graph of the integer part function 




In some texts, the integer part of a number is denoted by [x] or Int(x), and
sometimes the integer part function is called the floor function.


Curves that are not the graphs of functions 


If f is a real function with domain A, then f maps each real number x
in A to a single real number f(x). That is, each number in the domain has
exactly one image. This tells us that if we take the graph of f and draw a
vertical line through any number on the x-axis, then that vertical line
must cross the graph at most once. It will cross it once if the number on
the x-axis is in A, and it will not cross it at all otherwise. So, for example,
Figure 14(a) is the graph of a function, whereas Figure 14(b) is not,
because the vertical line crosses the curve three times.
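The vertical line idea can be tried out on a finite list of plotted points. The sketch below is only an illustration under that assumption: the function name `passes_vertical_line_test` and the two sample curves are hypothetical, and a finite sample can show that a curve is not the graph of a function but cannot prove that it is.

```python
def passes_vertical_line_test(points):
    """Return True if no x-value in `points` is paired with two different y-values."""
    seen = {}
    for x, y in points:
        if x in seen and seen[x] != y:
            return False  # a vertical line through x meets the curve twice
        seen[x] = y
    return True

# Points on y = x^2: each x has exactly one y.
parabola = [(x, x**2) for x in range(-3, 4)]

# Points on x = y^2: for example x = 1 occurs with y = -1 and y = 1.
sideways_parabola = [(y**2, y) for y in range(-3, 4)]

parabola_ok = passes_vertical_line_test(parabola)
sideways_ok = passes_vertical_line_test(sideways_parabola)
```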




Figure 14 (a) The graph of a function and (b) a curve that is not the graph 
of a function 


Recognising graphs 


You should now be able to recognise the graphs of functions that belong to 
the basic families covered in this subsection, or are related to them in a 
simple way. You should also be able to recognise curves that are not the 
graphs of functions. 


You can practise this using Figure 15. Each numbered part of the figure 
shows either 


• the graph of y = f(x), where f is a function belonging to, or closely
related to, one of the families of functions discussed in this subsection, or

• a curve in the plane that is not the graph of such a function.


For example, here are some comments on the first four parts of Figure 15 
to get you started. 


• Part 1 is a (non-vertical) straight line, so it is the graph of a linear
function.

• Part 2 is cup-shaped, so it is the graph of a quadratic function.

• In part 3, it is possible to draw a vertical straight line that crosses the
curve more than once, so this curve is not the graph of y as some
function of x.

• Part 4 appears to be the graph of a function related to the modulus
function.




Figure 15 A selection of graphs and curves 




The next exercise asks you to classify each of the parts of Figure 15 in a 
similar way. 


Exercise A141 


Look through all of the numbered parts of Figure 15, fairly quickly, and 
try to decide which of the following families each belongs to. 

(a) Linear
(b) Quadratic
(c) Cubic
(d) Trigonometric
(e) Linear rational
(f) Modulus (or related)
(g) Integer part (or related)
(h) Exponential (or related)
(i) Not the graph of y as some function of x.


1.3 Translations and scalings of graphs 


In this subsection we will revise some simple transformations that can be 
applied to graphs, and the effects that these transformations have on the 
rules of the corresponding functions. Knowing about these transformations 
allows you to recognise and sketch the graphs of a wide variety of functions 
related to those discussed in the previous subsection, and to understand 
some of the properties of such functions. 


Translations 


Figure 16 shows the graph of a function y = f(x), and the graph obtained
by translating it by α units to the right and β units upwards. This
transformation is called an (α, β)-translation.

Figure 16 The effect of an (α, β)-translation




Let us determine the equation of the translated graph. To do this, it is
helpful to temporarily use different symbols to denote the coordinates of
points on the translated graph. Let us use x' and y', as shown on the axes
of the translated graph in Figure 16. Now consider any point (x', y') on the
translated graph. We want to determine the relationship between x'
and y'. The point (x', y') on the translated graph is obtained by
translating some point, say (x, y), on the original graph, so x' = x + α and
y' = y + β, that is,

x = x' − α and y = y' − β. (1)

We know that the relationship between the coordinates x and y of the
point (x, y) on the original graph is given by the equation y = f(x). Using
equations (1) to substitute into this equation gives

y' − β = f(x' − α).

This equation gives the relationship between the coordinates x' and y' of
the point (x', y') on the translated graph. If we now rearrange it to obtain
just y' on the left-hand side, and use the usual x and y rather than x'
and y' for the coordinates on the translated graph, then we obtain

y = f(x − α) + β.

So this is the equation of the translated graph. For example, if the original
graph has equation y = x^2, and the translation is a (3, 2)-translation, then
the equation of the translated graph is y = (x − 3)^2 + 2. This quadratic
expression is in the completed-square form y = a(x − α)^2 + β so, by the
result you met in Subsection 1.2, its vertex is at (α, β) = (3, 2). This is just
what we would expect, because (3, 2) is the image of the origin under a
(3, 2)-translation.


The values α and β in an (α, β)-translation can be positive, zero or
negative. A negative value of α gives a translation to the left rather than
to the right, and a negative value of β gives a translation downwards
rather than upwards.
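The rule for a translated graph can be checked numerically. The following Python sketch is an illustration only: the helper name `translate` and its parameter names `alpha` (the shift to the right) and `beta` (the shift upwards) are chosen here and do not come from the module text. It builds the translated function from the rule y = f(x − alpha) + beta and confirms that each sampled point (x, y) on the original graph moves to (x + alpha, y + beta).

```python
def translate(f, alpha, beta):
    """Return the function g with g(x) = f(x - alpha) + beta."""
    def g(x):
        return f(x - alpha) + beta
    return g

def square(x):
    return x**2

# A (3, 2)-translation of y = x^2 should give y = (x - 3)^2 + 2.
g = translate(square, 3, 2)

# Each point (x, square(x)) should move to (x + 3, square(x) + 2).
points_move_correctly = all(g(x + 3) == square(x) + 2 for x in range(-5, 6))

# The vertex (0, 0) of y = x^2 should move to (3, 2).
vertex_value = g(3)
```

A negative value of `alpha` or `beta` in this sketch shifts the graph to the left or downwards, respectively.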


Scalings 


Figure 17 shows the graph of a function y = f(x), and the graph obtained
by scaling (stretching) it by the factor λ in the x-direction and the factor μ
in the y-direction, where λ, μ ≠ 0. This transformation is called a
(λ, μ)-scaling.

Figure 17 The effect of a (λ, μ)-scaling


Let us determine the equation of the scaled graph. To do this, as before we
use x' and y' to denote the coordinates of points on the scaled graph, as
shown in Figure 17. Now consider any point (x', y') on the scaled graph. It
is obtained from some point, say (x, y), on the original graph, where
x' = λx and y' = μy, that is,

x = x'/λ and y = y'/μ. (2)

We know that the relationship between the coordinates x and y of the
point (x, y) is given by the equation y = f(x). Using equations (2) to
substitute into this equation gives

y'/μ = f(x'/λ).

This equation gives the relationship between the coordinates x' and y' of
the point (x', y') on the scaled graph. If we now rearrange it to obtain
just y' on the left-hand side, and use x and y rather than x' and y' for the
coordinates, then we obtain

y = μ f(x/λ).


So this is the equation of the scaled graph. For example, if the original
graph has equation y = x^2, and the scaling is a (2, 1)-scaling, then the
equation of the scaled graph is y = 1 × (x/2)^2, which simplifies to y = x^2/4.


If the magnitude of λ, the scale factor in the x-direction, is less than 1,
then the graph ‘gets closer’ to the y-axis, and if the magnitude of λ is
greater than 1, then the graph ‘gets further away’ from the y-axis. A
negative value of λ causes the graph to be reflected in the y-axis as well as
scaled. These facts also hold for μ, the scale factor in the y-direction, with
respect to the x-axis rather than the y-axis.
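The rule for a scaled graph can be explored in the same way. In this Python sketch the helper name `scale` and the parameter names `lam` (the horizontal factor) and `mu` (the vertical factor) are assumptions made for illustration. It builds the scaled function from the rule y = mu × f(x/lam) and checks the (2, 1)-scaling of y = x^2, as well as the reflection produced by a horizontal factor of −1.

```python
def scale(f, lam, mu):
    """Return the function g with g(x) = mu * f(x / lam); both factors must be non-zero."""
    if lam == 0 or mu == 0:
        raise ValueError("scale factors must be non-zero")
    def g(x):
        return mu * f(x / lam)
    return g

def square(x):
    return x**2

def cube(x):
    return x**3

# A (2, 1)-scaling of y = x^2 should give y = x^2 / 4.
g = scale(square, 2, 1)
matches_quarter_square = all(g(x) == x**2 / 4 for x in range(-6, 7))

# Each point (x, y) on the original graph should move to (2x, y).
points_move_correctly = all(g(2 * x) == square(x) for x in range(-6, 7))

# A horizontal factor of -1 reflects the graph in the y-axis: g(x) = f(-x).
reflected = scale(cube, -1, 1)
reflection_ok = all(reflected(x) == cube(-x) for x in range(-6, 7))
```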

Applying the transformations 


Here is a summary of the results about translations and scalings that you 
have seen in this subsection. 


Translations and scalings of graphs 


• Applying an (α, β)-translation to the graph of y = f(x) gives the
graph of

y = f(x − α) + β.

• Applying a (λ, μ)-scaling to the graph of y = f(x) gives the graph of

y = μ f(x/λ).

You might find it easier to appreciate what is happening with these
translations and scalings from the following forms of these equations:

y − β = f(x − α) and y/μ = f(x/λ).




Worked Exercise A78 


The graph of y = sin x is shown below.

Sketch the graphs of the following.
(a) y = sin(x + π/2) (b) y = 2 sin x (c) y = sin 2x

Solution


(a) Starting from the equation y = sin x, we replace x by x + π/2
to obtain the equation y = sin(x + π/2). So we have a translation
with α = −π/2 and β = 0; that is, a translation to the left by
π/2.

The graph of y = sin(x + π/2) is obtained from the graph of
y = sin x by a (−π/2, 0)-translation.


(b) Starting from the equation y = sin x, we multiply the
right-hand side by 2 to obtain the equation y = 2 sin x. So we
have a scaling with λ = 1 and μ = 2; that is, a scaling by a factor
of 2 vertically.

The graph of y = 2 sin x is obtained from the graph of y = sin x
by a (1, 2)-scaling.


(c) Starting from the equation y = sin x, we multiply x by 2 to
obtain the equation y = sin 2x. So we have a scaling with λ = 1/2
and μ = 1; that is, a scaling by a factor of a half horizontally.

The graph of y = sin 2x is obtained from the graph of y = sin x
by a (1/2, 1)-scaling.


Since the graph of y = sin(x + π/2), as shown in the solution to Worked
Exercise A78(a), is obtained by translating the graph of y = sin x to the
left by the distance π/2, it is the same as the graph of y = cos x. In other
words, cos x = sin(x + π/2) for each real number x.
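The relationship between the graphs of sine and cosine can be checked numerically. The following Python sketch compares the two sides at sample points; the sample grid and the name `max_difference` are chosen here for illustration. Floating-point rounding means that the difference is tiny rather than exactly zero, and a finite sample is only evidence for the identity, not a proof of it.

```python
import math

# Sample points from -10 to 10 in steps of 0.1.
xs = [k * 0.1 for k in range(-100, 101)]

# Largest gap found between cos x and sin(x + pi/2) over the sample.
max_difference = max(abs(math.cos(x) - math.sin(x + math.pi / 2)) for x in xs)
```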


Exercise A142 


Determine how the graph of each of the following trigonometric functions 
can be obtained by transforming the graph of y = cos x, and hence match 
each function to its graph. 

(a) y = cos (5) (b) y = 2 cos x (c) y = 2 cos 2x

(d) y = $cos x


[Graphs A, B, C and D]


Each of the graphs in Worked Exercise A78 and Exercise A142 involved 
either a scaling or a translation. The graph in the next worked exercise 
involves both. 




Worked Exercise A79 


The graph of y = |x| is shown below. 


Sketch the graph of y = −2|x + 1| − 1.


Solution 


Here we have both a scaling and a translation, so we have to take
care with the order in which we apply them. To do this, we think
about how the equation y = −2|x + 1| − 1 is obtained from the
equation y = |x|, and apply the corresponding transformations to the
graph in the same order.

Starting with the equation y = |x|, we multiply the right-hand side
by −2 to obtain the equation y = −2|x|. We then replace x by x + 1
in this equation to obtain y = −2|x + 1|, and finally we add −1.

So the graph of y = −2|x + 1| − 1 is obtained from the graph of
y = |x| by applying a (1, −2)-scaling followed by a
(−1, −1)-translation, which gives the graph sketched below.
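To see why the order of the transformations matters, you can compose them numerically. The Python sketch below is illustrative only: the helpers `scale` and `translate` and the names `target`, `correct_order` and `wrong_order` are hypothetical. It applies the scaling and then the translation to y = |x|, compares the result with y = −2|x + 1| − 1, and shows that the opposite order gives a different function.

```python
def scale(f, lam, mu):
    """Scaling by lam horizontally and mu vertically: g(x) = mu * f(x / lam)."""
    return lambda x: mu * f(x / lam)

def translate(f, alpha, beta):
    """Translation by alpha to the right and beta upwards: g(x) = f(x - alpha) + beta."""
    return lambda x: f(x - alpha) + beta

def target(x):
    return -2 * abs(x + 1) - 1

# Scaling first, then translation, as in the worked solution.
correct_order = translate(scale(abs, 1, -2), -1, -1)

# Translation first, then scaling: this gives y = -2(|x + 1| - 1) instead.
wrong_order = scale(translate(abs, -1, -1), 1, -2)

xs = range(-5, 6)
correct_matches = all(correct_order(x) == target(x) for x in xs)
wrong_matches = all(wrong_order(x) == target(x) for x in xs)
```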


Exercise A143 


Determine how the graph of each of the following quadratic functions can
be obtained by transforming the graph of y = x^2, and hence match each
function to its graph.

(a) y = (x − 2)^2 + 1 (b) y = (x + 2)^2 + 1 (c) y = −(x − 2)^2 + 1
(d) y = (x − 2)^2 − 1

[Graphs A, B, C and D]


2 Graph sketching 


In this section you will be sketching the graphs of functions that are rather 
more complicated than the basic functions you met in the previous section. 


Remember that the aim of sketching the graph of a function is to provide a 
visual summary of the main properties of the function. Here you will learn 
techniques for analysing the properties of functions, and see how these 
techniques can be combined into a general strategy for sketching their 
graphs. 


To begin our investigation of the main properties of functions, let us look
carefully at the graph of the function

f(x) = 1/(1 − x^2).


By our convention, the domain of this function is the set of all real
numbers excluding 1 and −1; its graph is sketched in Figure 18.

Figure 18 The graph of y = 1/(1 − x^2)


Several key properties of the function f can be seen from this graph. 


• The function is undefined at x = 1 and at x = −1; that is, its domain
consists of the three intervals

(−∞, −1), (−1, 1) and (1, ∞).

• The graph is symmetric about the y-axis.

• The graph of f crosses the y-axis when y = f(0) = 1;
the graph of f does not cross the x-axis.

• f takes positive values on the interval (−1, 1);
f takes negative values on the intervals (−∞, −1) and (1, ∞).

• f(x) increases as x increases on the intervals (0, 1) and (1, ∞);
f(x) decreases as x increases on the intervals (−∞, −1) and (−1, 0);
f has a local minimum at x = 0.

• f(x) becomes very large and positive as x approaches 1 from the left
or −1 from the right;
f(x) becomes very large and negative as x approaches 1 from the right
or −1 from the left;
f(x) gets closer and closer to 0 as x becomes large and positive or large
and negative.


When sketching the graph of a function, you should concentrate on 
representing important features like those listed above, since it is these 
features that help us understand the behaviour of functions. There is no 
need for a sketch to have the detailed accuracy of a computer plot. 


2.1 Determining features of a graph 


In this subsection we look in detail at each of the features you saw above
in the graph of the function f(x) = 1/(1 − x^2), and discuss how these and
related properties arise in the graphs of other functions.


We will consider the properties and graphs of a wide range of functions,
including general polynomial and rational functions. Recall that a
polynomial function of degree n is a function of the form

f(x) = a_n x^n + a_(n−1) x^(n−1) + ··· + a_1 x + a_0,

where a_n ≠ 0. The linear, quadratic and cubic functions you met in
Subsection 1.2 are examples of polynomial functions of degree 1, 2 and 3,
respectively. A rational function is a function defined by a rule of the
form

f(x) = p(x)/q(x),

where both p and q are polynomial functions. The linear rational functions
you met in Subsection 1.2 are a simple example of this type of function.


We now discuss each of the main features of graphs of functions in turn. 


Domain 


When the domain of a function is not given, we use our convention and 
take the domain to be the set of all real numbers for which the given rule 
is applicable. So the domain is the set of all real numbers, excluding any 
numbers that give an expression that is not defined — for example, they 
might make the denominator of a rational function equal to zero, or give a 
negative number under a square root sign. 


When sketching a graph it is particularly helpful to express the domain of
a function as a union of intervals, since this emphasises the intervals on
which the function is defined, and identifies any points at which it is not
defined. For example, you saw above that the function f(x) = 1/(1 − x^2) is
defined on the three intervals (−∞, −1), (−1, 1) and (1, ∞) but is
undefined at x = ±1.


Symmetry features 


There are three distinct ways in which the graph of a real function may 
exhibit symmetry properties. These are illustrated in Figure 19. 


• The graph in Figure 19(a) is unchanged when reflected in the y-axis. A
function whose graph has this property is called an even function.

• The graph in Figure 19(b) is unchanged when rotated through the
angle π about the origin. A function whose graph has this property is
called an odd function.

• The graph in Figure 19(c) is unchanged when translated along the x-axis
by a distance p, but not when translated by any distance less than p. A
function whose graph has this property is called a periodic function
with period p.


Figure 19 Graphs of (a) an even function, (b) an odd function and
(c) a periodic function

These properties can be expressed algebraically as follows.

A function f is even if
f(−x) = f(x), for all x in the domain of f.

A function f is odd if
f(−x) = −f(x), for all x in the domain of f.

A function f is periodic if there is a number p such that
f(x + p) = f(x), for all x in the domain of f.


The markings of x and −x on the x-axes in Figure 19(a) and (b) should
help you see why the algebraic definitions of even and odd functions are
correct. Notice that the domain of a function must be symmetric about 0
for the concepts of evenness and oddness to make sense.
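The algebraic conditions for even and odd functions can be tested at sample points. The Python sketch below is a rough check rather than a proof: the function name `classify_symmetry`, the tolerance `tol` and the list `samples` are assumptions made for illustration. Agreement at finitely many points only suggests that a function is even or odd, whereas a single failing point is enough to rule the property out.

```python
import math

def classify_symmetry(f, sample_points, tol=1e-9):
    """Compare f(-x) with f(x) and with -f(x) at each sample point."""
    looks_even = all(abs(f(-x) - f(x)) <= tol for x in sample_points)
    looks_odd = all(abs(f(-x) + f(x)) <= tol for x in sample_points)
    if looks_even and looks_odd:
        return "even and odd"
    if looks_even:
        return "even"
    if looks_odd:
        return "odd"
    return "neither"

# Sample points chosen to avoid x = 1, where 1/(1 - x^2) is undefined.
samples = [0.25, 0.5, 2.0, 3.0]

results = {
    "cos": classify_symmetry(math.cos, samples),
    "sin": classify_symmetry(math.sin, samples),
    "cube": classify_symmetry(lambda x: x**3, samples),
    "exp": classify_symmetry(math.exp, samples),
    "rational": classify_symmetry(lambda x: 1 / (1 - x**2), samples),
}
```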


You have met several examples of even and odd functions already. For
example, the functions x ↦ x^2 and x ↦ cos x, whose graphs are shown
in Figure 20(a) and (b), are even functions. The function
f(x) = 1/(1 − x^2), whose graph we looked at earlier in this section, is also
an even function.

The functions x ↦ x^3 and x ↦ sin x, whose graphs are shown in
Figure 20(c) and (d), are odd functions. Note that both even and odd
functions may also be periodic; this is true of x ↦ cos x and x ↦ sin x,
for example.

Many functions are neither odd nor even. To show algebraically that a
function is neither odd nor even, we just need to find one value of x in the
domain such that f(−x) is equal to neither −f(x) nor f(x).




Figure 20 Graphs of (a), (b) even functions and (c), (d) odd functions 


Exercise A144 


Identify which of the following is the graph of: 
(a) an odd function 
(b) an even function 


(c) a function that is neither odd nor even. 




Intercepts 


An intercept is a value of x or y at which the graph y = f(x) of a
function f meets the x- or y-axis, respectively. The x-intercepts are the
solutions (if there are any) of the equation f(x) = 0. The y-intercept is the
value f(0), if this exists. These features are illustrated in Figure 21.


It is usually straightforward to find the y-intercept, but harder to find the 
x-intercepts, since this involves solving the equation f(x) = 0. It is not 
always possible to solve this equation algebraically, but it is usually 
possible to obtain estimates for the solutions by finding intervals of the 
domain in which the values of the function f change sign. 
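One way to obtain such estimates is to search for sign changes and then halve the interval repeatedly. The Python sketch below illustrates this under stated assumptions: the function names `sign_change_intervals` and `bisect`, and the example cubic x^3 − x − 1, are chosen here and are not taken from the module. The method assumes that the function is defined and has no jumps on the interval; a function such as 1/(1 − x^2) changes sign across x = 1 without having a zero there.

```python
def sign_change_intervals(f, left, right, steps):
    """Split [left, right] into equal pieces and keep those where f changes sign."""
    width = (right - left) / steps
    intervals = []
    for k in range(steps):
        a = left + k * width
        b = left + (k + 1) * width
        if f(a) * f(b) < 0:
            intervals.append((a, b))
    return intervals

def bisect(f, a, b, tol=1e-10):
    """Halve a sign-change interval [a, b] until it is shorter than tol."""
    while b - a > tol:
        mid = (a + b) / 2
        if f(mid) == 0:
            return mid
        if f(a) * f(mid) < 0:
            b = mid  # the sign change lies in the left half
        else:
            a = mid  # the sign change lies in the right half
    return (a + b) / 2

def cubic(x):
    return x**3 - x - 1

intervals = sign_change_intervals(cubic, -2, 2, 4)
estimate = bisect(cubic, *intervals[0])
```

For this cubic the only sign change found is between x = 1 and x = 2, so the x-intercept lies in that interval.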


Figure 21 The x-intercepts and the y-intercept of a function f


Intervals on which a function is positive or negative 


We say that a function f is positive on a particular interval I if f(x) is
positive for each value of x in I. So f is positive on the intervals for which
the graph lies above the x-axis. For example, the function whose graph is
shown in Figure 22(a) is positive on the open intervals (a, b) and (c, ∞).
The modulus function, whose graph is shown in Figure 22(b), is positive
on the open intervals (−∞, 0) and (0, ∞).

Similarly, we say that a function f is negative on a particular interval I if
f(x) is negative for each value of x in I. So f is negative on the intervals
for which the graph lies below the x-axis. For example, the function whose
graph is shown in Figure 22(a) is negative on the open intervals (−∞, a)
and (b, c). The modulus function is nowhere negative.

We say that a function f has a zero at x if f(x) = 0. So a function has a
zero at each number x where its graph crosses or touches the x-axis, that is,
at each of the x-intercepts of f. The function whose graph is Figure 22(a)
has zeros at a, b and c, and the modulus function has just one zero, at 0.


Figure 22 Sections of graphs where a function is positive or negative


If f is a polynomial or a rational function, a useful technique for finding
the intervals on which f is positive or negative is to construct a table of
signs for f. For example, consider again the rational function

f(x) = 1/(1 − x^2),

whose graph we looked at earlier. We can factorise f as

f(x) = 1/((1 − x)(1 + x)),

and construct the table of signs as follows.


• Write each of the factors of f as the heading of a row in the table, and
add a final row for the function f(x) itself. Include any factors from the
numerator of a rational function, as well as from the denominator (in
this example, the numerator is just 1).

• For the column headings, write (in increasing order) the values of x for
which the factors of f are equal to zero, and also the largest open
intervals to the left and right of, and between, these values.


For our function f, this gives the following blank table:

x     | (−∞, −1) | −1 | (−1, 1) | 1 | (1, ∞)
1     |          |    |         |   |
1 − x |          |    |         |   |
1 + x |          |    |         |   |
f(x)  |          |    |         |   |


Now, for each factor, we complete the table by writing 0, + or − to
indicate whether the factor is zero, positive or negative for values of x in
the range indicated by each column heading. Finally, in each column, we
then use the signs of the factors to determine the sign of f. (Where f is
undefined, we enter the symbol ∗ to indicate this.) The resulting table of
signs for f is as follows:

x     | (−∞, −1) | −1 | (−1, 1) | 1 | (1, ∞)
1     |    +     | +  |    +    | + |   +
1 − x |    +     | +  |    +    | 0 |   −
1 + x |    −     | 0  |    +    | + |   +
f(x)  |    −     | ∗  |    +    | ∗ |   −

From this table of signs we deduce that

• f has no zeros

• f is positive on the interval (−1, 1)

• f is negative on the intervals (−∞, −1) and (1, ∞).
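A table of signs like this one can also be generated by evaluating each factor at one test point per column. The Python sketch below illustrates the idea; the names `columns`, `factors`, `sign_of_f` and `table`, the choice of test points −2, 0 and 2, and the use of `*` for undefined entries are assumptions made here. The approach relies on the fact that each factor keeps the same sign throughout each open interval between consecutive zeros of the factors.

```python
def sign(value):
    """Return '+', '-' or '0' according to the sign of value."""
    if value > 0:
        return "+"
    if value < 0:
        return "-"
    return "0"

# Each column pairs a heading with one test value of x from that column.
columns = [("(-inf, -1)", -2), ("-1", -1), ("(-1, 1)", 0), ("1", 1), ("(1, inf)", 2)]

# Row headings and the factors of f(x) = 1/((1 - x)(1 + x)).
factors = {
    "1": lambda x: 1,
    "1 - x": lambda x: 1 - x,
    "1 + x": lambda x: 1 + x,
}

def sign_of_f(x):
    """Sign of f(x), or '*' where the denominator is zero and f is undefined."""
    denominator = (1 - x) * (1 + x)
    if denominator == 0:
        return "*"
    return sign(1 / denominator)

table = {name: [sign(factor(x)) for _, x in columns] for name, factor in factors.items()}
table["f(x)"] = [sign_of_f(x) for _, x in columns]

for name, row in table.items():
    print(f"{name:6}", " ".join(row))
```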




If f is a quadratic function, and we are unable to factorise it, then we can
sometimes find out whether it always has the same sign by completing the
square, using the method you saw in Worked Exercise A77. For example,

2x^2 + 12x + 19 = 2(x^2 + 6x) + 19
                = 2((x + 3)^2 − 9) + 19
                = 2(x + 3)^2 − 18 + 19
                = 2(x + 3)^2 + 1.

From this we can see that 2x^2 + 12x + 19 is always positive, whatever the
value of x.
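The completed-square form can be computed directly from the coefficients. The following Python sketch is an illustration only: the function name `complete_the_square` and the returned triple (a, h, k), meaning a(x + h)^2 + k, are conventions chosen here. It uses exact fractions so that no rounding occurs.

```python
from fractions import Fraction

def complete_the_square(a, b, c):
    """Return (a, h, k) such that a*x**2 + b*x + c equals a*(x + h)**2 + k."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a == 0:
        raise ValueError("the coefficient of x^2 must be non-zero")
    h = b / (2 * a)
    k = c - b * b / (4 * a)
    return a, h, k

# 2x^2 + 12x + 19 should become 2(x + 3)^2 + 1.
a, h, k = complete_the_square(2, 12, 19)

# With a > 0 the squared term is never negative, so k > 0 means
# the expression is positive for every value of x.
always_positive = a > 0 and k > 0
```

In the same way, if a and k are both negative, then the expression is negative for every value of x; if a and k have opposite signs, then the expression takes both signs.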


Exercise A145 


For each of the following quadratic expressions, complete the square and 
determine whether it always has the same sign, regardless of the value of x. 


(a) x^2 − 6x + 11 (b) 3x^2 + 12x − 1


Intervals on which a function is increasing or decreasing


Look at Figure 23(a), which shows the graph of the integer part function.
This graph looks like a staircase: it goes uphill in a sequence of steps. It
never goes downhill: as x increases, the value of f(x) either stays the same
or gets larger. In this module we say that a function with this property on
a particular interval is increasing on that interval. So the integer part
function is increasing on ℝ. There is an analogous definition of
decreasing.

Now look at the graph in Figure 23(b), which has no flat sections. On the
interval (−∞, d), the graph goes uphill: f(x) gets larger as x gets larger.
So f is increasing on this interval. Moreover, on this interval f(x) increases
in the usual sense of the word ‘increase’. If we wish to emphasise this, then
we say that f is strictly increasing on (−∞, d). On the interval (d, e),
the graph goes downhill with no flat sections, and we say that f is strictly
decreasing on this interval. Finally, on the interval (e, ∞), the graph goes
uphill again, so f is strictly increasing on this interval.




Figure 23 Sections of graphs where a function is increasing or decreasing 


Note that, according to our definitions, a function that is strictly increasing
on an interval is also increasing on that interval, and similarly a function
that is strictly decreasing on an interval is also decreasing on that interval.


Exercise A146 


Identify which of the following is the graph of a function that is:

(a) increasing but not strictly increasing
(b) strictly increasing
(c) decreasing but not strictly decreasing
(d) strictly decreasing
(e) increasing on part of its domain and decreasing on another part of its
domain.

[Graphs A to F]


The following is a formal statement of the definitions introduced in the 
discussion above. 


Definitions

A function f is increasing on an interval I if, for all x_1, x_2 ∈ I,
if x_1 < x_2, then f(x_1) ≤ f(x_2).

A function f is strictly increasing on an interval I if, for all
x_1, x_2 ∈ I,
if x_1 < x_2, then f(x_1) < f(x_2).

A function f is decreasing on an interval I if, for all x_1, x_2 ∈ I,
if x_1 < x_2, then f(x_1) ≥ f(x_2).

A function f is strictly decreasing on an interval I if, for all
x_1, x_2 ∈ I,
if x_1 < x_2, then f(x_1) > f(x_2).
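The difference between increasing and strictly increasing can be seen by comparing function values at a list of sample points. In the Python sketch below the helper names `is_increasing` and `is_strictly_increasing` and the sample grid `xs` are hypothetical. Because only finitely many points are compared, the check can show that a function fails to be increasing, but it cannot prove that the definition holds on a whole interval.

```python
import math

def is_increasing(f, xs):
    """Check f(x1) <= f(x2) for consecutive sample points x1 < x2."""
    ys = [f(x) for x in sorted(set(xs))]
    return all(y1 <= y2 for y1, y2 in zip(ys, ys[1:]))

def is_strictly_increasing(f, xs):
    """Check f(x1) < f(x2) for consecutive sample points x1 < x2."""
    ys = [f(x) for x in sorted(set(xs))]
    return all(y1 < y2 for y1, y2 in zip(ys, ys[1:]))

# Sample points from -3 to 3 in steps of 0.25.
xs = [k / 4 for k in range(-12, 13)]

# The integer part function: increasing, but with flat steps.
floor_increasing = is_increasing(math.floor, xs)
floor_strict = is_strictly_increasing(math.floor, xs)

# The cube function has no flat sections on these samples.
cube_strict = is_strictly_increasing(lambda x: x**3, xs)

# y = -x fails even the weaker condition.
negation_increasing = is_increasing(lambda x: -x, xs)
```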




For a differentiable function f, we can use the derivative f’ of the function 
to identify these intervals. You will meet more formally what it means for 
a function to be differentiable in the analysis units of this module. For the 
moment, you can assume that a function whose graph has no jumps or 
sharp corners is differentiable, and that you can find the derivative of the 
function by the usual methods of calculus. You will find a table of 
standard derivatives in the module Handbook, and also a list of the rules 
for differentiating functions that you have met in your previous studies. 


Since f'(x) is the gradient of the graph of f at the value x in the domain, 
we have the following facts. 


Increasing/decreasing criteria

• If f'(x) > 0 for all x in an interval I, then f is strictly increasing
on I.

• If f'(x) < 0 for all x in an interval I, then f is strictly decreasing
on I.


A point where the graph of a function changes from being strictly 
increasing to strictly decreasing is called a local maximum, because at 
such a point the value of the function is larger than at all nearby points. 
Similarly, a point where a graph changes from being strictly decreasing to 
strictly increasing is called a local minimum. For a differentiable 
function f, these are examples of stationary points of the function, that 
is, values of x at which f'(x) = 0. (The term stationary point is sometimes 
also used to refer to the corresponding point (x, f(x)) on the graph of f. 
The tangent to the graph at such a point is horizontal.) 


For example, Figure 24(a) shows the graph of a differentiable function with
a local maximum and a local minimum. We see that, for this function,

• if x ∈ (−∞, d) or x ∈ (e, ∞), then f'(x) > 0 and f is strictly increasing

• if x ∈ (d, e), then f'(x) < 0 and f is strictly decreasing

• f'(d) = 0, and at x = d the graph of f changes from being strictly
increasing to strictly decreasing, so f has a local maximum

• f'(e) = 0, and at x = e the graph of f changes from being strictly
decreasing to strictly increasing, so f has a local minimum.


Figure 24 Local maxima and local minima of functions


Note that functions that are not differentiable can still have local maxima 
or minima. For example, the graph of the modulus function, shown in 
Figure 24(b), has a local minimum at x = 0. For such functions, the local 
maxima or minima may need to be found without using calculus. 


A stationary point of a differentiable function need not be a local
maximum or a local minimum. For example, consider f(x) = x^3, graphed
in Figure 25. Since f'(0) = 0, this function has a stationary point at 0, but
it has neither a local maximum nor a local minimum there. In fact, it has
what we call a horizontal point of inflection at x = 0.


We can check whether a stationary point is a local maximum, a local 
minimum or a horizontal point of inflection by using the following test. 


Figure 25 The graph of f(x) = x^3

First Derivative Test

Suppose that a is a stationary point of a differentiable function f, so
that f'(a) = 0.

• If f' changes from positive to negative as x increases through a,
then f has a local maximum at a.

• If f' changes from negative to positive as x increases through a,
then f has a local minimum at a.

• If f' remains positive or remains negative as x increases through a
(except at a itself, where f'(a) = 0), then f has a horizontal
point of inflection at a.


Note, however, that f’ may do none of these things; for example, a 
constant function has f'(x) = 0 at all values x in the domain, so every 
point on its graph is a stationary point. 
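The First Derivative Test can be sketched numerically by sampling the derivative just to the left and just to the right of a stationary point. In the Python sketch below the function name `classify_stationary_point` and the offset `h` are assumptions made for illustration, and the derivatives are supplied by hand. The check assumes that f' does not change sign between a − h and a, or between a and a + h, so it is a rough aid rather than a substitute for a table of signs.

```python
def classify_stationary_point(f_prime, a, h=1e-3):
    """Classify a stationary point a from the signs of f' at a - h and a + h."""
    left = f_prime(a - h)
    right = f_prime(a + h)
    if left > 0 and right < 0:
        return "local maximum"
    if left < 0 and right > 0:
        return "local minimum"
    if left * right > 0:
        return "horizontal point of inflection"
    return "inconclusive"

# f(x) = x^3 has derivative 3x^2, which is positive on both sides of 0.
cubic_case = classify_stationary_point(lambda x: 3 * x**2, 0)

# f(x) = 1/(1 - x^2) has derivative 2x/(1 - x^2)^2.
rational_case = classify_stationary_point(lambda x: 2 * x / (1 - x**2) ** 2, 0)

# f(x) = -x^2 has derivative -2x.
maximum_case = classify_stationary_point(lambda x: -2 * x, 0)

# A constant function has derivative 0 everywhere, so none of the cases applies.
constant_case = classify_stationary_point(lambda x: 0, 0)
```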


If f is a polynomial or a rational function, we can construct a table of
signs for f' to determine the intervals on which f is increasing and
decreasing, and at the same time determine the nature of any stationary
points. We illustrate this by returning to the function

f(x) = 1/(1 − x^2).

To find the intervals on which this function is increasing or decreasing, we
use the quotient rule to differentiate f, which gives

f'(x) = ((1 − x^2) × 0 − 1 × (−2x))/(1 − x^2)^2
      = 2x/(1 − x^2)^2.

There is no need to factorise the term (1 − x^2)^2, because its value is always
at least 0.




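The table of signs for f' is as follows.

x           | (−∞, −1) | −1 | (−1, 0) | 0 | (0, 1) | 1 | (1, ∞)
2x          |    −     | −  |    −    | 0 |   +    | + |   +
(1 − x^2)^2 |    +     | 0  |    +    | + |   +    | 0 |   +
f'(x)       |    −     | ∗  |    −    | 0 |   +    | ∗ |   +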


We find that 

• f has a stationary point at 0

• f is increasing on the intervals (0, 1) and (1, ∞)

• f is decreasing on the intervals (−∞, −1) and (−1, 0).


We deduce that f has a local minimum at 0, by the First Derivative Test. 


Exercise A147 


Use a table of signs to find the intervals on which the function
f(x) = x^4 − 2x^2 + 3


is increasing and decreasing, and use the First Derivative Test to 
determine the nature of the stationary points. Give the value of the 
function at each of the stationary points. 


Asymptotic behaviour of functions 


For a function f, the term asymptotic behaviour refers to the behaviour 
of the graph of y = f(x) at the points of the graph for which the variable x
or the variable y takes arbitrarily large values. 


For example, let us consider how to determine the features of the graph of 
the function $f(x) = 1/(1 - x^2)$ as x or y approaches $\pm\infty$.


Figure 26(a) shows a plot produced by a computer that has used a
‘join-the-dots’ approach to generate the graph of this function. The
computer plot is inaccurate near the ‘missing’ points x = 1 and x = −1,
since it gives the impression that the graph is a vertical line at x = 1 and
x = −1, whereas we know that the function is not defined at these points.


It is common for computer plots of graphs to give misleading results near 
such ‘difficult’ points. By contrast, Figure 26(b) shows a sketch of this 
graph which indicates the behaviour of the function f near the points 1 
and —1 by the use of broken vertical lines. 


Figure 26 Depictions of the graph of the function $f(x) = 1/(1 - x^2)$: (a) an
inaccurate computer plot and (b) a more accurate sketch showing asymptotes


An asymptote with an equation of the form x = a is a vertical 
asymptote. For example, in the graph in Figure 26(b), the lines x = —1 
and x = 1 are vertical asymptotes. 


An asymptote with an equation of the form y = b is a horizontal 
asymptote. For example, in the graph in Figure 26(b), the line y = 0 is a
horizontal asymptote. 


A broken line is used to indicate an asymptote on the graph of a function, 
except when the asymptote coincides with one of the axes. Both features 
are shown in Figure 26(b), which has two vertical asymptotes, shown as 
broken lines, and a horizontal asymptote that coincides with the x-axis. 


The behaviour of a function f near a vertical asymptote x = a may take
various forms. For the example above, we can describe the behaviour near
the vertical asymptote x = −1 by saying that f takes arbitrarily large
positive values as x tends to −1 from the right, which is written in symbols
as

$$f(x) \to \infty \quad\text{as } x \to -1^{+},$$

and read as

f of x tends to infinity as x tends to −1 from the right.


Similarly, f(x) takes arbitrarily large negative values as x tends to −1
from the left, which is written in symbols as

$$f(x) \to -\infty \quad\text{as } x \to -1^{-},$$

and read as

f of x tends to minus infinity as x tends to −1 from the left.


You will see how intuitive statements of this nature can be formally 
defined in Book F Analysis 2 of this module. 
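Before the formal definitions, the one-sided behaviour of f(x) = 1/(1 − x²) near x = −1 can be explored numerically. This is only an informal sketch; the step sizes `h` are arbitrary choices made for illustration.

```python
def f(x):
    """f(x) = 1/(1 - x^2), defined for x != 1 and x != -1."""
    return 1 / (1 - x**2)


for h in (1e-1, 1e-3, 1e-5):
    # x = -1 + h approaches -1 from the right; x = -1 - h from the left.
    print(f"h = {h:g}: f(-1 + h) = {f(-1 + h):.1f}, f(-1 - h) = {f(-1 - h):.1f}")
```

The values on the right of −1 grow large and positive, while those on the left grow large and negative, in line with the two statements above.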


2 Graph sketching 




Two more examples of asymptotic behaviour are illustrated in Figure 27. 




Figure 27 Asymptotic behaviour: (a) vertical asymptote and (b) horizontal 
asymptote 


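Figure 27(a) shows a vertical asymptote where

$$f(x) \to \infty \ \text{as } x \to a^{+} \quad\text{and}\quad f(x) \to -\infty \ \text{as } x \to a^{-},$$

and Figure 27(b) shows a horizontal asymptote where

$$f(x) \to b \ \text{as } x \to \infty \quad\text{and}\quad f(x) \to b \ \text{as } x \to -\infty.$$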


Figure 27(b) also shows that the graph of a function may cross a horizontal 
asymptote. 


Exercise A148 


Describe in symbols the behaviour of the function 


o= 


“I -z 


near the vertical asymptote at x = 1 and near the horizontal asymptote.


There are other types of behaviour that a function may exhibit as the 
domain variable x takes large positive or negative values. For example, the 
function graphed in Figure 28(a) has 


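$$f(x) \to \infty \ \text{as } x \to \infty \quad\text{and}\quad f(x) \to -\infty \ \text{as } x \to -\infty,$$

and the function in Figure 28(b) has

$$f(x) \to -\infty \ \text{as } x \to \infty \quad\text{and}\quad f(x) \to \infty \ \text{as } x \to -\infty.$$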




Figure 28 Some other types of asymptotic behaviour 


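For a function, such as $f(x) = x^2$, where we have

$$f(x) \to \infty \ \text{as } x \to \infty \quad\text{and}\quad f(x) \to \infty \ \text{as } x \to -\infty,$$

we write

$$f(x) \to \infty \ \text{as } x \to \pm\infty.$$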


Exercise A149 


Describe in symbols the asymptotic behaviour of the functions whose 
graphs are given below. 


(a) YA (b) YA 



f(x) = —2x+ f(x) = —22° 


If f is a polynomial function of degree n, that is, is a function of the form

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,$$

where $a_n \neq 0$, then we define the dominant term of f to be $a_n x^n$, the
term with the highest power of x, and we call $a_n$ the coefficient of the
dominant term. For example, the polynomial function
$f(x) = 4x^3 - 2x^2 + 1$ has dominant term $4x^3$, which has coefficient 4.


A polynomial function f has no vertical asymptotes since it is defined for 
all x in R, and its asymptotic behaviour for large values of x is the same as 
that of its dominant term $a_n x^n$. This behaviour is summarised as follows.
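The dependence on the dominant term can be sketched as a small function of the degree n and the coefficient of the dominant term. The function name `end_behaviour` and the strings used for the results are assumptions made for this illustration.

```python
def end_behaviour(degree, leading_coefficient):
    """Return the limits of a polynomial as x -> +infinity and x -> -infinity.

    Only the dominant term a_n x^n matters; degree must be at least 1 and
    leading_coefficient must be non-zero.
    """
    if degree < 1 or leading_coefficient == 0:
        raise ValueError("need degree >= 1 and a non-zero leading coefficient")
    as_x_to_plus_inf = "+inf" if leading_coefficient > 0 else "-inf"
    if degree % 2 == 0:
        # Even power: x^n is positive for large negative x as well.
        as_x_to_minus_inf = as_x_to_plus_inf
    else:
        # Odd power: x^n changes sign with x.
        as_x_to_minus_inf = "-inf" if leading_coefficient > 0 else "+inf"
    return as_x_to_plus_inf, as_x_to_minus_inf


# A polynomial with dominant term 4x^3.
print(end_behaviour(3, 4))   # ('+inf', '-inf')
```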




If f is a rational function, that is, a function of the form 


$$f(x) = \frac{p(x)}{q(x)},$$

where both p and q are polynomial functions, then locating any vertical 
and horizontal asymptotes is an important step in sketching its graph. 


The vertical asymptotes of a rational function occur at the values of x for 
which q(x) = 0 and $p(x) \neq 0$, if there are any such values.


A rational function can have at most one horizontal asymptote, and if it 
does have one, then its graph approaches the asymptote arbitrarily closely 
as $x \to \pm\infty$.


To find the behaviour of a rational function for large positive or negative 
values of x, and hence identify any horizontal asymptote, we compare the 
dominant term of the numerator p, say $a_n x^n$, with the dominant term of
the denominator q, say $b_m x^m$:


• if n > m, then the rational function has no horizontal asymptote

• if n < m, then the line y = 0 is a horizontal asymptote

• if n = m, then the line y = c is a horizontal asymptote, where c is the
ratio of the coefficients of the dominant terms of the numerator and
denominator, that is, $c = a_n / b_m$.
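The three cases above can be sketched in code. In this illustration a polynomial is represented by its list of integer coefficients from the highest power down; this representation and the name `horizontal_asymptote` are assumptions of the sketch.

```python
from fractions import Fraction


def horizontal_asymptote(p_coeffs, q_coeffs):
    """Return b if y = b is a horizontal asymptote of p/q, else None.

    Coefficients are listed from the highest power down, so the first
    entry of each list is the coefficient of the dominant term and must
    be non-zero.
    """
    n = len(p_coeffs) - 1  # degree of the numerator
    m = len(q_coeffs) - 1  # degree of the denominator
    if n > m:
        return None
    if n < m:
        return Fraction(0)
    return Fraction(p_coeffs[0], q_coeffs[0])


print(horizontal_asymptote([2, -3], [1, -1]))        # (2x - 3)/(x - 1)
print(horizontal_asymptote([1], [-1, 0, 1]))         # 1/(1 - x^2)
print(horizontal_asymptote([1, 0, 0, 0], [1, 1]))    # x^3/(x + 1)
```

Using `Fraction` keeps the ratio of the coefficients exact rather than rounding it to a decimal.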


You will see this technique applied in the worked exercises in the next 
subsection. 


2.2 Strategy for graph sketching 


This subsection begins with a strategy summarising the basic features that 
a sketch of a graph should convey. This is followed by worked exercises 
illustrating the strategy, and some exercises for you to try. 


Strategy A3 Graph-sketching strategy 


To sketch the graph of a function f, determine the following features 
of f (where possible), and show these features in your sketch. 


1. The domain of f.

2. Whether f is even, odd or periodic (or none of these).

3. The x-intercepts and the y-intercept of f, if any.

4. The intervals on which f is positive or negative.

5. The intervals on which f is increasing or decreasing, the nature of
any stationary points, and the value of f at each of these points.

6. The asymptotic behaviour of f.


The steps of this strategy are numbered for easy reference, and in this unit
are referred to as ‘step 1’, ‘step 2’, etc. However, it is not necessary to carry
out the steps in the order given above, although it is important to begin by
determining the domain of f. For example, if the domain is [3, 9], then f is
neither even nor odd, and you cannot find the behaviour of f as $x \to \infty$.


For some functions, you will be able to obtain enough information to 
sketch the graph without including all the steps of the strategy. On the 
other hand, it is often useful to obtain information in more than one way, 
since this provides a check on your working. For certain functions, though, 
you may find that some steps in the strategy are not easy to carry out, in 
which case it is fine to omit them. 


One step that is in general quite tricky is testing whether a given function 
is periodic. All the periodic functions you will meet in this module involve 
a trigonometric function (sin, cos, tan, cot, sec or cosec), and you need not 
test for periodicity unless a trigonometric function appears in the rule of f 
and you can guess the period. However, note that not all functions 
involving a trigonometric function are periodic; this will be the case, for 
example, if the rule of the function also contains a non-periodic element 
that ‘overrides’ the periodic behaviour of the trigonometric function. 


When sketching graphs, you should choose the scales on your axes with 
care: usually, the scales should be the same on both axes, but it may be 
necessary to have unequal scales in order to display some key features of 
the graph, for example, when f(x) is much larger than x.


We first use the strategy to sketch the graphs of polynomial functions. 


Worked Exercise A80 


Sketch the graph of the function 
$$f(x) = 4x^3 + 3x^2 - 6x + 4.$$




4. Because we cannot find the zeros of f, we cannot find the
intervals on which f is positive or negative.


By step 3, the sign of f changes from negative to positive as x
increases from −2 to −1.


5. Differentiating gives

$$f'(x) = 12x^2 + 6x - 6 = 6(2x^2 + x - 1) = 6(2x - 1)(x + 1).$$


We construct a table of signs for f’. 


Thus 


• f is increasing on the intervals $(-\infty, -1)$ and $(\tfrac12, \infty)$

• f is decreasing on the interval $(-1, \tfrac12)$

• f has stationary points at −1 and $\tfrac12$.

By the First Derivative Test, we deduce that

• there is a local maximum at x = −1 with f(−1) = 9

• there is a local minimum at $x = \tfrac12$ with $f(\tfrac12) = \tfrac94$.


The results of steps 4 and 5 show that the graph of f crosses
the x-axis at only one point.


6. We consider the behaviour of the dominant term $4x^3$.


The power of x in the dominant term is odd and its coefficient is
positive, so

$$f(x) \to \infty \ \text{as } x \to \infty \quad\text{and}\quad f(x) \to -\infty \ \text{as } x \to -\infty.$$


This information enables us to sketch the graph. 
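The arithmetic behind this sketch can be checked with a few lines of code. This is only a verification aid under the assumption that f(x) = 4x³ + 3x² − 6x + 4, as stated in the worked exercise; it is not part of the strategy itself.

```python
def f(x):
    """The polynomial of Worked Exercise A80."""
    return 4 * x**3 + 3 * x**2 - 6 * x + 4


def f_prime(x):
    """Its derivative, 12x^2 + 6x - 6 = 6(2x - 1)(x + 1)."""
    return 12 * x**2 + 6 * x - 6


# The derivative vanishes at the two stationary points.
for a in (-1.0, 0.5):
    print(f"f'({a}) = {f_prime(a)}, f({a}) = {f(a)}")

# Sign change of f between -2 and -1, which locates the single x-intercept.
print(f(-2.0), f(-1.0))
```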


Because of the steepness of the curve, it is convenient to use
different scales on the x- and y-axes.


Exercise A150 


Sketch the graph of the polynomial function 
$$f(x) = x^4 - 2x^2 + 3.$$


(You carried out step 5 in Exercise A147.)


Hint: In step 3, if you put $t = x^2$, then the expression becomes $t^2 - 2t + 3$.


Next, we use the strategy to sketch the graphs of linear rational functions. 


Worked Exercise A81 
Sketch the graph of the function 


$$f(x) = \frac{2x - 3}{x - 1}.$$


Solution 
1. The domain of f is ℝ − {1}, by our convention.


2. The function is neither even nor odd, since its domain is not
symmetric about 0.


Alternatively, $f(-1) = \tfrac52$, but f is not defined at x = 1.

3. f(x) = 0 when 2x − 3 = 0, so the x-intercept is $\tfrac32$.
f(0) = −3/(−1) = 3, so the y-intercept is 3.


4. We construct a table of signs for f.


x          (−∞, 1)    1    (1, 3/2)   3/2   (3/2, ∞)
2x − 3        −       −       −        0       +
x − 1         −       0       +        +       +
f(x)          +       *       −        0       +




So

• f is positive on the intervals $(-\infty, 1)$ and $(\tfrac32, \infty)$

• f is negative on the interval $(1, \tfrac32)$.

5. By the quotient rule,

$$f'(x) = \frac{(x - 1) \times 2 - (2x - 3) \times 1}{(x - 1)^2} = \frac{1}{(x - 1)^2}.$$

The derivative f' is undefined at 1, and f'(x) > 0 for x < 1 and
x > 1. Thus

• f is increasing on the intervals $(-\infty, 1)$ and $(1, \infty)$

• f has no stationary points.


6. The denominator of f(x) is 0 when x = 1, so the line x = 1 is a
vertical asymptote.


From step 4, f(x) is positive as x tends to $1^{-}$, and f(x) is
negative as x tends to $1^{+}$.


Thus, by the results of step 4,

$$f(x) \to \infty \ \text{as } x \to 1^{-} \quad\text{and}\quad f(x) \to -\infty \ \text{as } x \to 1^{+}.$$


To find the behaviour of f for large values of x, we compare the
dominant term in the numerator, 2x, with the dominant term in
the denominator, x. Here the powers of x in these terms are equal,
so there is a horizontal asymptote.


The dominant term of the numerator is 2x, with coefficient 2, and
the dominant term of the denominator is x, with coefficient 1. The
power of x in the two dominant terms is the same, so the line
y = 2/1 = 2 is a horizontal asymptote.


This information enables us to sketch the graph. 




We could have rewritten the function in Worked Exercise A81 as

$$f(x) = \frac{2(x - 1) - 1}{x - 1} = -\frac{1}{x - 1} + 2,$$

and deduced that the graph of f can be obtained from the graph of
y = 1/x by a (1, −1)-scaling followed by a (1, 2)-translation. However, this
translation and scaling are easier to deduce with hindsight, after sketching 
the graph, so unless a translation and/or a scaling of a standard graph are 
obvious, it is easiest to follow the strategy. 


Exercise A151 


Sketch the graph of the linear rational function 


xr—3 
f(a) = 5. 


Next, we sketch the graph of a more complicated rational function. 


Worked Exercise A82 


Sketch the graph of the function 


$$f(x) = \frac{x^2 - 5x + 4}{x^2 + 5x + 4}.$$




So

• f is positive on the intervals $(-\infty, -4)$, $(-1, 1)$ and $(4, \infty)$

• f is negative on the intervals $(-4, -1)$ and $(1, 4)$.


5. Use the quotient rule to differentiate f. This is simpler if we
rearrange the expression for f first, but it is not essential to spot
this.


We have

$$f(x) = \frac{x^2 - 5x + 4}{x^2 + 5x + 4} = 1 - \frac{10x}{x^2 + 5x + 4}.$$

Thus, by the quotient rule,

$$f'(x) = \frac{(x^2 + 5x + 4)(-10) - (-10x)(2x + 5)}{(x^2 + 5x + 4)^2} = \frac{10(x^2 - 4)}{(x^2 + 5x + 4)^2} = \frac{10(x - 2)(x + 2)}{(x + 1)^2 (x + 4)^2}.$$


We construct a table of signs for f’. 


x           (−∞, −4)   −4   (−4, −2)   −2   (−2, −1)   −1   (−1, 2)    2   (2, ∞)
10(x − 2)       −       −       −       −       −       −      −       0     +
x + 2           −       −       −       0       +       +      +       +     +
(x + 1)²        +       +       +       +       +       0      +       +     +
(x + 4)²        +       0       +       +       +       +      +       +     +
f'(x)           +       *       +       0       −       *      −       0     +


So, using this table and the First Derivative Test, we deduce that 


• f is increasing on the intervals $(-\infty, -4)$, $(-4, -2)$ and $(2, \infty)$

• f is decreasing on the intervals $(-2, -1)$ and $(-1, 2)$

• f has stationary points at −2 and 2

• there is a local maximum at x = −2 with f(−2) = −9

• there is a local minimum at x = 2 with $f(2) = -\tfrac19$.

6. The denominator is 0 when x = −4 or x = −1, so the lines x = −4
and x = −1 are vertical asymptotes.


Thus, by the results of step 4,

$$f(x) \to \infty \ \text{as } x \to -4^{-} \quad\text{and}\quad f(x) \to -\infty \ \text{as } x \to -4^{+};$$
$$f(x) \to -\infty \ \text{as } x \to -1^{-} \quad\text{and}\quad f(x) \to \infty \ \text{as } x \to -1^{+}.$$


To find the behaviour of f for large values of x, we compare the
dominant term of the numerator with the dominant term of the
denominator. Here the powers of x in these terms are equal, so
there is a horizontal asymptote.


The dominant term in both the numerator and the denominator
is $x^2$. Thus the power of x is the same in each case. The ratio of
the coefficients of the dominant terms is 1. Therefore the line y = 1
is a horizontal asymptote.


This information enables us to sketch the graph. 


Here the difference in the y-coordinates of the stationary points
makes it hard to draw all the graph’s features clearly, so we
exaggerate the vertical scale at the local minimum.


Exercise A152 


Sketch the graph of the rational function 


1 
FO ee 


The same ideas can be used to sketch the graph of a function that is not 
rational or polynomial, as shown by the application of Strategy A3 in the 
next worked exercise. 




Worked Exercise A83 


Sketch the graph of $f(x) = \dfrac{1}{\sqrt{1 + x^2}}$.


Solution

1. The domain of f is ℝ.

2. f is even, since, for all x in ℝ,

$$f(-x) = \frac{1}{\sqrt{1 + (-x)^2}} = \frac{1}{\sqrt{1 + x^2}} = f(x).$$

3. The equation f(x) = 0 has no solution, so there are no x-intercepts.
The y-intercept is f(0) = 1.

4. There is no need for a table of signs.
f is positive on ℝ.

5. Use the chain rule to differentiate $f(x) = (1 + x^2)^{-1/2}$.
By the chain rule,

$$f'(x) = -\tfrac12 (1 + x^2)^{-3/2} (2x) = -\frac{x}{(1 + x^2)^{3/2}},$$

so the denominator of f' is always positive, and therefore

f'(x) = 0 when x = 0,
f'(x) < 0 when x > 0,
f'(x) > 0 when x < 0.

We deduce that

• f is decreasing on the interval $(0, \infty)$
• f is increasing on the interval $(-\infty, 0)$
• f has a stationary point at x = 0.

By the First Derivative Test, we deduce that there is a local
maximum at x = 0. We have f(0) = 1.

6. This is not a rational function, but similar arguments work here
to find a horizontal asymptote. As x becomes large and positive,
the 1 in the denominator $\sqrt{1 + x^2}$ becomes insignificant compared
to the value of $x^2$, so the function f behaves in a similar way to
g(x) = 1/x.

As x becomes large and positive the function behaves in a similar
way to the more familiar function g(x) = 1/x. Therefore the line
y = 0 is a horizontal asymptote.


The function is even so we have

$$f(x) \to 0 \ \text{as } x \to \pm\infty.$$
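The claim in step 6 that f behaves like g(x) = 1/x for large positive x can be explored numerically by looking at the ratio f(x)/g(x) = x f(x). This is an informal check at a few sample values, chosen arbitrarily, and not a proof.

```python
import math


def f(x):
    """f(x) = 1/sqrt(1 + x^2), as in Worked Exercise A83."""
    return 1 / math.sqrt(1 + x**2)


for x in (10.0, 100.0, 1000.0):
    # The ratio f(x) / (1/x) = x * f(x) should be close to 1 for large x.
    print(x, f(x), x * f(x))
```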


In this section we have used the First Derivative Test to determine 
whether a given stationary point is a local maximum, a local minimum or 
neither. You may have met the following alternative test for a local 
maximum or local minimum. 


Second Derivative Test 


Suppose that a is a stationary point of a differentiable function f, so
that f'(a) = 0.


• If f''(a) < 0, then f has a local maximum at a.


• If f''(a) > 0, then f has a local minimum at a.


This test can be very efficient as a means of classifying stationary points. 
However, for some functions it is too complicated to find the second 
derivative. Moreover, if f''(a) = 0, then the Second Derivative Test gives
no result: the stationary point may be a local maximum, a local minimum, 
or neither. This is why Strategy A3 uses the First Derivative Test. 
However, it is fine for you to use the Second Derivative Test when it is 
convenient. 
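For polynomials the Second Derivative Test is easy to automate. The sketch below represents a polynomial by its coefficient list [a_0, a_1, ..., a_n]; this representation and the function names are assumptions of the illustration. It confirms the classification found in Worked Exercise A80, and shows the test giving no result for x³ (a horizontal point of inflection at 0) and for x⁴ (a local minimum at 0).

```python
from fractions import Fraction


def derivative(coeffs):
    """Differentiate a polynomial given as [a_0, a_1, ..., a_n]."""
    return [k * a for k, a in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial [a_0, a_1, ..., a_n] at x."""
    return sum(a * x**k for k, a in enumerate(coeffs))


def second_derivative_test(coeffs, a):
    """Apply the Second Derivative Test at a stationary point a."""
    if evaluate(derivative(coeffs), a) != 0:
        raise ValueError("a is not a stationary point")
    second = evaluate(derivative(derivative(coeffs)), a)
    if second < 0:
        return "local maximum"
    if second > 0:
        return "local minimum"
    return "no result"


cubic = [4, -6, 3, 4]  # 4 - 6x + 3x^2 + 4x^3, as in Worked Exercise A80
print(second_derivative_test(cubic, -1))
print(second_derivative_test(cubic, Fraction(1, 2)))
print(second_derivative_test([0, 0, 0, 1], 0))     # x^3
print(second_derivative_test([0, 0, 0, 0, 1], 0))  # x^4
```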


3 New graphs from old 


In this section you will extend your graph sketching capabilities to include 
the graphs of combinations of functions, including sums and products of 
two functions, composite functions and hybrid functions. 




3.1 Further graph-sketching techniques 


We start by looking at some techniques for sketching the graph of a 
combination of two functions, one of which is a trigonometric function. 

To do this, we follow the steps of Strategy A3 as far as possible, but in 
some cases we find that part or all of some steps are not necessary, or too 
tricky to apply. We can also exploit known features of the trigonometric 
functions, such as the fact that the values of sin x and cos x oscillate (with 
period $2\pi$) between the values 1 and −1. Because of this oscillation, it is
often convenient to use other simple graphs as construction lines for the 
graph we are sketching. So, for this subsection, we add another step to 
Strategy A3 as follows. 


Strategy A4 Extended graph-sketching strategy 


To sketch the graph of a function f, determine the following features 
of f (where possible), and show these features in your sketch. 


1. The domain of f.

2. Whether f is even, odd or periodic (or none of these).

3. The x-intercepts and the y-intercept of f, if any.

4. The intervals on which f is positive or negative.

5. The intervals on which f is increasing or decreasing, the nature of
any stationary points, and the value of f at each of these points.

6. The asymptotic behaviour of f.

7. Any appropriate construction lines, and the points where f meets
these lines.


The following worked exercise illustrates Strategy A4. 


Worked Exercise A84 


Sketch the graph of the function 


$$f(x) = x \sin x.$$


Although this function involves a trigonometric function, it
seems unlikely to be periodic because of the factor of x, so we omit
this test.


3. We have f(x) = 0 when x = 0 or when sin x = 0.

For $x \ge 0$, sin x = 0 when $x = 0, \pi, 2\pi, \ldots$.

For $x \ge 0$, the x-intercepts are $0, \pi, 2\pi, \ldots$.

The y-intercept is 0 since f(0) = 0.


4. The intervals on which f is positive or negative (for $x \ge 0$)
alternate between the x-intercepts in the same way as for the sine
function.


For $x \ge 0$,

• f is positive on $(0, \pi), (2\pi, 3\pi), \ldots$
• f is negative on $(\pi, 2\pi), (3\pi, 4\pi), \ldots$.


5. Differentiating gives f'(x) = sin x + x cos x, but f'(x) = 0 is not
easy to solve. We can obtain enough information to sketch the
graph from other steps.


f'(x) = sin x + x cos x, so we omit solving f'(x) = 0, as it is not
easy.


6. The function f has no asymptotes as it is defined for all values
of x and does not tend to a limit as x tends to $\pm\infty$.


The function f has no asymptotes.

7. Here we use what we know about the sine function.


Since $-1 \le \sin x \le 1$, for all real numbers x, we have

$$-x \le x \sin x \le x, \quad\text{for } x \ge 0.$$

That is,

$$-x \le f(x) \le x, \quad\text{for } x \ge 0,$$

so, for $x \ge 0$, the graph of f lies between the lines y = x and
y = −x. These are the construction lines for this function.

The function f, for $x \ge 0$, has the following features:

f(x) = x when sin x = 1,

f(x) = −x when sin x = −1.


sin x = 1 when $x = \pi/2, 5\pi/2, \ldots$, and sin x = −1 when
$x = 3\pi/2, 7\pi/2, \ldots$.


For $x \ge 0$, the graph of the function f


• meets the construction line y = x when $x = \pi/2, 5\pi/2, \ldots$


• meets the construction line y = −x when $x = 3\pi/2, 7\pi/2, \ldots$.




This information enables us to sketch the graph. 


We start by drawing the construction lines y = x and y = −x.
Then we draw dots to indicate the points where the graph of f meets
these construction lines and where it crosses the x-axis. We complete
the sketch by drawing a smooth curve through these points and then
reflecting the graph in the y-axis.


The sketch produced in the worked exercise above does not give the precise 
positions of the local maxima and minima of the graph of the function: it 
is not a precise drawing, but a sketch indicating the general behaviour and 
most of the important features. It can be shown that the dots on y = x 
and y = —x are not actually the local maxima and minima, as the more 
accurate enlargement in Figure 29 illustrates. 
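A short numerical check makes this concrete for f(x) = x sin x. At x = π/2, where the graph meets the line y = x, the derivative sin x + x cos x is 1 rather than 0, so this point is not stationary. The sketch below also uses bisection, a technique not used in this unit and introduced here only for illustration, to locate the nearby local maximum in the interval (π/2, π).

```python
import math


def f_prime(x):
    """Derivative of f(x) = x sin x, namely sin x + x cos x."""
    return math.sin(x) + x * math.cos(x)


# At x = pi/2 the graph touches the line y = x, but the slope there is 1,
# not 0, so this point is not a stationary point of f.
print(f_prime(math.pi / 2))

# Bisection on [pi/2, pi], where f' changes sign from positive to negative,
# locates the nearby local maximum.
lo, hi = math.pi / 2, math.pi
for _ in range(60):
    mid = (lo + hi) / 2
    if f_prime(mid) > 0:
        lo = mid
    else:
        hi = mid
local_max_x = (lo + hi) / 2
print(local_max_x, math.pi / 2)
```

The local maximum lies a little to the right of π/2, which is why the dots on the construction lines are close to, but not exactly at, the turning points of the graph.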


Figure 29 Enlargement of the graph of the function f(x) = x sin x


Exercise A153 


Sketch the graph of the function f(x) = x cos x.


Exercise A154 


Sketch the graph of the function f(x) = x + sin x.


So far in this section we have considered combinations of functions 
involving sums and products. We now briefly consider the graphs of some 
composite functions. 


Recall that a composite function is a function, such as f(x) = sin(1/x),
that can be obtained by applying first one function (here, $x \mapsto 1/x$) and
then another function (here, $x \mapsto \sin x$).


We can use the extended graph-sketching strategy (Strategy A4) to sketch
the graphs of some composite functions, including f(x) = sin(1/x).
However, we begin by noting that there are some composite functions
where the properties of one of the functions allow you to ‘spot’ the
behaviour of the composite function without needing to work through the
strategy. The next exercise is one such case.


Exercise A155 


Sketch the graph of the composite function 


$$f(x) = |\sin x|.$$


Note that in Exercise A155 it would not have been possible to use calculus 
to find the local maxima and minima because the function f(x) = |sin x| is
not a differentiable function (its graph has sharp corners). 


The remainder of this subsection is useful preparation for the analysis 
units of this module, but you will not be assessed on it at this stage. If you 
are short of time now, you could read the rest of this subsection quickly 
and revisit it later. 


If a composite function involves a trigonometric function, then we can 
exploit known features of the trigonometric function in sketching its graph, 
just as we did in some of the earlier worked exercises. 


Consider the function

$$f(x) = \sin\frac{1}{x}.$$


We can apply the strategy, and the first few steps follow in much the same 
way as in Worked Exercise A84: the domain of f is ℝ − {0}, the function
is odd and there is no y-intercept since f is not defined when x = 0. 




The x-intercepts are the values of x for which f(x) = 0; that is, when
sin(1/x) = 0. For x > 0, this is when $1/x = \pi, 2\pi, 3\pi, \ldots$, so the
x-intercepts are

$$\frac{1}{\pi}, \frac{1}{2\pi}, \frac{1}{3\pi}, \ldots.$$

Thus a big difference between the graph of this function and the one in 
Worked Exercise A84 is that here the x-intercepts, or zeros, become ever 
closer together as x approaches 0 from the right. It follows that, as x 
approaches 0 from the right, the intervals on which f is positive or 
negative become progressively smaller, and the local maxima and minima 
become progressively closer together: the oscillations of the graph bunch 
closer and closer together as x approaches 0 from the right. 


For x > 0, it can be shown that f

• is positive on $\left(\frac{1}{\pi}, \infty\right)$, and on $\left(\frac{1}{3\pi}, \frac{1}{2\pi}\right), \left(\frac{1}{5\pi}, \frac{1}{4\pi}\right), \ldots$

• is negative on $\left(\frac{1}{2\pi}, \frac{1}{\pi}\right), \left(\frac{1}{4\pi}, \frac{1}{3\pi}\right), \ldots$

• has maxima when $x = \frac{2}{\pi}, \frac{2}{5\pi}, \frac{2}{9\pi}, \ldots$ with f(x) = 1,

• has minima when $x = \frac{2}{3\pi}, \frac{2}{7\pi}, \frac{2}{11\pi}, \ldots$ with f(x) = −1,

• tends to 0, as $x \to \infty$.

Therefore the lines y = ±1 are construction lines, and y = 0 is a horizontal
asymptote.

A sketch of the graph of y = sin(1/x) is shown in Figure 30. Note that the
function is not defined at the origin, and it is not possible to sketch the
graph in the region close to the origin where the oscillations become closer
and closer together.

You will meet this function again in the analysis units in Book D
Analysis 1 of this module.
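The bunching of the oscillations of f(x) = sin(1/x) can be seen by listing the first few positive x-intercepts, x = 1/(kπ), and the distances between consecutive ones. The choice of seven intercepts below is arbitrary and made only for illustration.

```python
import math

# Positive x-intercepts of f(x) = sin(1/x): x = 1/(k*pi) for k = 1, 2, 3, ...
zeros = [1 / (k * math.pi) for k in range(1, 8)]
gaps = [zeros[k] - zeros[k + 1] for k in range(len(zeros) - 1)]

for x, gap in zip(zeros, gaps):
    print(f"zero at x = {x:.5f}, distance to the next zero = {gap:.5f}")
```

Each gap is smaller than the one before, which is the numerical counterpart of the zeros becoming ever closer together as x approaches 0 from the right.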


Figure 30 A sketch of the graph of the function f(x) = sin(1/x)


We finish this subsection by looking at another function which is not
defined when x = 0. This function is a quotient of two functions, one of
which is a trigonometric function:

$$f(x) = \frac{\sin x}{x}.$$


Again, we can apply the strategy, and the steps follow in much the same 
way as in Worked Exercise A84: the domain of f is ℝ − {0}, the function
is even and there is no y-intercept, since f is not defined at x = 0. 


The x-intercepts are the values of x for which (sin x)/x = 0; that is, for
x > 0, when $x = \pi, 2\pi, 3\pi, \ldots$. The intervals on which f is positive or negative alternate
between these x-intercepts in the same way as for the sine function.
Now, $-1 \le \sin x \le 1$, so, for x > 0,

$$-\frac{1}{x} \le \frac{\sin x}{x} \le \frac{1}{x}.$$

Therefore, for x > 0, the curves y = 1/x and y = −1/x are construction
lines.


This is almost enough information to sketch the graph of f, but what 
happens as x approaches 0? It turns out that 


$$\frac{\sin x}{x} \to 1 \quad\text{as } x \to 0,$$


though you cannot deduce this from what you know so far. You will see a 
proof of this result in Book F. 
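Although the proof comes later, the limit can be made plausible by evaluating (sin x)/x at values of x close to 0. The helper name `ratio` and the sample values are assumptions of this sketch, which offers numerical evidence only.

```python
import math


def ratio(x):
    """(sin x)/x, defined for x != 0."""
    return math.sin(x) / x


for x in (1.0, 0.1, 0.01, 0.001):
    # The function is even, so negative x gives the same values.
    print(x, ratio(x), ratio(-x))
```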


A sketch of the graph of y = (sin x)/x is shown in Figure 31. 


Figure 31 A sketch of the graph of the function f(x) = (sin x)/x 


The graph of this function has a ‘hole’ when x = 0, and it seems natural to
fill this hole by defining f(0) = 1. In this way, we can extend the domain
of this function to include 0:

$$g(x) = \begin{cases} \dfrac{\sin x}{x}, & x \neq 0, \\ 1, & x = 0. \end{cases}$$




Figure 32 The graph of the 
hybrid function f 




This is an example of a hybrid function; hybrid functions will be defined in 
the next subsection. You will meet this function g again in Book F, where 
you will see that, with this definition, the function is continuous at 0. 


However, for the function f(x) = sin(1/x) sketched in Figure 30, we cannot
‘fill in the hole’ at x = 0 by defining an appropriate hybrid function: 
whatever value we assign to f(0), we cannot extend the domain of the 
function to the whole of R so that it is continuous at 0. You will also see 
this in Book D. 


3.2 Hybrid functions 


You have seen that the rule of a function is one of its main components, 
which may suggest that a function always has a single formula associated 
with it, but this is not the case. Some functions of the greatest practical 
importance are hybrid functions that have rules which are defined by 
different formulas on different parts of their domains. 


To specify a hybrid function, we need to state which rule applies on which
part of the domain, and we use a curly bracket to list the different cases.
For example, consider the function

$$f(x) = \begin{cases} 1, & 1 < x \le 2, \\ 0, & x \le 1 \text{ or } x > 2. \end{cases}$$


The function f has domain R, since f is defined for each x in R and for 
each such x, there is a unique value of f. It takes the value 1 on the 
interval (1, 2], and the value 0 elsewhere, as illustrated in Figure 32. 
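In a programming language, a hybrid function corresponds naturally to a function whose body tests which part of the domain the input lies in. A minimal sketch of the function f just described:

```python
def f(x):
    """Hybrid function: 1 on the interval (1, 2], and 0 elsewhere."""
    if 1 < x <= 2:
        return 1
    return 0


print([f(x) for x in (0, 1, 1.5, 2, 2.5)])   # [0, 0, 1, 1, 0]
```

Note that the endpoint 1 is excluded from the interval and the endpoint 2 is included, so f(1) = 0 and f(2) = 1.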


Worked Exercise A85 


Sketch the graph of the function 


You will recognise that the graph of the function in Worked Exercise A85 
is not smooth: it has a ‘jump’ at x = 0 and a ‘corner’ at x = 1. The
meanings of these features will be made precise in Books D and F. 


Exercise A156 


Sketch the graph of each of the following hybrid functions. 


x 


r’, <1 =o a 
7 (b) f(z)=4|z-1|, O<2<2 
t, £>1 


ga-2, x>2 
x, xr<0 


sinz, «>0 




Figure 33 The graph of the 
exponential function 




4 Hyperbolic functions 


In this section you will revise hyperbolic functions and explore their 
properties. 


4.1 Properties of hyperbolic functions 


In Subsection 1.2 you met the graph of the exponential function $f(x) = e^x$,
often referred to as exp, which is shown in Figure 33. The function exp has 
the following properties which will be explained and discussed in greater 
detail in Books D and F: 


• the domain of exp is ℝ

• exp is not even, odd or periodic

• $e^x > 0$ for all x in ℝ, so exp is positive on ℝ

• exp is its own derivative, that is, if $f(x) = e^x$, then $f'(x) = e^x$

• since $e^x > 0$ for all x in ℝ, exp is increasing on ℝ

• $e^0 = 1$, $e^x > 1$ for all x > 0 and $e^x < 1$ for all x < 0

• $e^{x+y} = e^x e^y$ for all x, y in ℝ

• $e^x \to \infty$ as $x \to \infty$ and $e^x \to 0$ as $x \to -\infty$

• if n is any positive integer, then $e^x / x^n \to \infty$ as $x \to \infty$.


We sometimes express this final property by saying that $e^x$ grows faster
than any polynomial when x is large. 
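This growth property can be illustrated numerically by tabulating the quotient of the exponential and a power of x. The exponent n = 5 and the sample values of x below are arbitrary choices made for this sketch.

```python
import math

n = 5  # any positive integer; 5 is an arbitrary choice for illustration
sample_points = (10, 20, 40, 80)
ratios = [math.exp(x) / x**n for x in sample_points]

for x, r in zip(sample_points, ratios):
    print(x, r)
```

The quotient is less than 1 at x = 10 but grows very rapidly thereafter, which is consistent with the final property in the list above.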


The following exercise gives you some practice in manipulating exponential 
terms. 


Exercise A157 


Simplify each of the following expressions so that it involves no products or 
quotients. 


(a) e” (e? 4 e7?) (b) (et _ e 28) /e? (c) (e52 + a 2 \ (a = eae) 


Certain combinations of $e^x$ and $e^{-x}$ appear so frequently in mathematics
that it is useful to introduce functions that express these combinations
more concisely. The functions that we need are the hyperbolic functions
cosh, sinh and tanh, all of which have domain ℝ:


• cosh is the hyperbolic cosine function, with rule

$$\cosh x = \tfrac12 \left(e^x + e^{-x}\right)$$

• sinh is the hyperbolic sine function, with rule

$$\sinh x = \tfrac12 \left(e^x - e^{-x}\right)$$

• tanh is the hyperbolic tangent function, with rule

$$\tanh x = \frac{\sinh x}{\cosh x}.$$


It is usual to pronounce ‘cosh’ as it is spelled, ‘sinh’ as ‘sinsh’ or ‘shine’, 
and ‘tanh’ as ‘tansh’ or ‘than’ (as in ‘thank’). 
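The three rules translate directly into code. The sketch below defines cosh, sinh and tanh from the exponential function exactly as in the definitions, and the sample values of x are arbitrary. Python's standard `math` module also provides `math.cosh`, `math.sinh` and `math.tanh`, which can be used to check the results.

```python
import math


def cosh(x):
    """Hyperbolic cosine: (e^x + e^(-x)) / 2."""
    return (math.exp(x) + math.exp(-x)) / 2


def sinh(x):
    """Hyperbolic sine: (e^x - e^(-x)) / 2."""
    return (math.exp(x) - math.exp(-x)) / 2


def tanh(x):
    """Hyperbolic tangent: sinh x / cosh x (cosh x is never 0)."""
    return sinh(x) / cosh(x)


for x in (-2.0, 0.0, 0.5, 3.0):
    print(x, cosh(x), sinh(x), tanh(x))
```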


The name ‘hyperbolic’ originates from the use of these functions as 
parametric forms for a hyperbola, a type of conic. You will meet conics and 
their parametric forms in Section 5. 


The first systematic development of hyperbolic functions was by 
Johann Heinrich Lambert (1728-1777) in the mid-eighteenth century, 
although it is now known that there is a link between hyperbolic 
functions and the formulas used by the Flemish cartographer 
Gerardus Mercator (1512-1594) in the construction of his map 
projection of 1569. In the nineteenth century, the widespread use of 
electricity led to an increased interest in hyperbolic functions due to 
their application in the transmission of electrical power. 


At first sight, the hyperbolic functions seem unrelated to the trigonometric 
functions, but in fact there is a very strong connection between them, 
which becomes apparent when we view the hyperbolic functions as complex 
functions, that is, functions whose domain is C. To see that such a 
connection exists, recall Euler’s Formula, 


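$$e^{ix} = \cos x + i \sin x, \quad\text{for } x \in \mathbb{R},$$

from Unit A2 Number systems. It follows that

$$e^{-ix} = \cos(-x) + i \sin(-x) = \cos x - i \sin x,$$

and therefore

$$\cos x = \tfrac12 \left(e^{ix} + e^{-ix}\right) \quad\text{and}\quad \sin x = \frac{1}{2i} \left(e^{ix} - e^{-ix}\right).$$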


If you compare these expressions with the definitions of cosh and sinh, you 
will see that 


$$\cosh(ix) = \cos x \quad\text{and}\quad \sinh(ix) = i \sin x.$$
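These two relations can be checked numerically using complex arithmetic. The sketch below relies on Python's standard `cmath` module, which extends cosh and sinh to complex arguments; the sample values of x are arbitrary.

```python
import cmath
import math

for x in (0.0, 0.7, 2.0, -1.3):
    z = complex(0.0, x)  # the purely imaginary number ix
    # Each pair of printed values should agree up to rounding error.
    print(cmath.cosh(z), math.cos(x), cmath.sinh(z), 1j * math.sin(x))
```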


You will learn more about complex functions if you take your study of 
pure mathematics further. In this unit, you will only study the hyperbolic 
functions as real functions. You will see that in some ways they behave 
like the corresponding trigonometric functions, but in other ways they are 
quite different. 


The next two exercises demonstrate some similarities between the 
hyperbolic functions cosh and sinh and the trigonometric functions cos 
and sin. Note that $\cosh^2 x$ and $\sinh^2 x$ are abbreviations for $(\cosh x)^2$ and
$(\sinh x)^2$, respectively.




Exercise A158 
Using the definitions above, prove the following. 
(a) $\cosh^2 x - \sinh^2 x = 1$
(b) $\cosh(x + y) = \cosh x \cosh y + \sinh x \sinh y$
(c) $\sinh(x + y) = \sinh x \cosh y + \cosh x \sinh y$
Hint: In parts (b) and (c), start from the right-hand side. 
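
Before attempting the proofs, you may find it reassuring to test the three identities numerically. The sketch below is an illustration only: it builds cosh and sinh directly from their defining rules, and the function name `identities_hold` and the tolerance are assumptions made for this example. It does not replace the algebraic proofs asked for in the exercise.

```python
import math


def cosh(x: float) -> float:
    """Hyperbolic cosine, computed from its defining rule."""
    return 0.5 * (math.exp(x) + math.exp(-x))


def sinh(x: float) -> float:
    """Hyperbolic sine, computed from its defining rule."""
    return 0.5 * (math.exp(x) - math.exp(-x))


def identities_hold(x: float, y: float, tol: float = 1e-9) -> bool:
    """Test identities (a), (b) and (c) at the real numbers x and y."""
    a = math.isclose(cosh(x) ** 2 - sinh(x) ** 2, 1.0, abs_tol=tol)
    b = math.isclose(cosh(x + y),
                     cosh(x) * cosh(y) + sinh(x) * sinh(y),
                     rel_tol=tol)
    c = math.isclose(sinh(x + y),
                     sinh(x) * cosh(y) + cosh(x) * sinh(y),
                     rel_tol=tol, abs_tol=tol)
    return a and b and c


print(identities_hold(0.3, -1.2))
```

The values of x and y are kept moderate here, because for large x the difference of the two squares in (a) loses accuracy in floating-point arithmetic.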


Exercise A159 


Find the derivatives of the functions cosh x and sinh x, and compare your
answers with the derivatives of cos x and sin x. 


As you might expect, we can also define three other hyperbolic functions: 


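$$\operatorname{sech} x = \frac{1}{\cosh x}, \quad \text{just as} \quad \sec x = \frac{1}{\cos x};$$
$$\operatorname{cosech} x = \frac{1}{\sinh x}, \quad \text{just as} \quad \operatorname{cosec} x = \frac{1}{\sin x};$$
$$\coth x = \frac{1}{\tanh x}, \quad \text{just as} \quad \cot x = \frac{1}{\tan x}.$$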


These functions are the hyperbolic secant function, the hyperbolic 
cosecant function, and the hyperbolic cotangent function, 
respectively. It is usual to pronounce ‘sech’ as ‘sesh’ or ‘sheck’, ‘cosech’ as 
‘co-sesh’ or ‘co-sheck’, and ‘coth’ to rhyme with ‘moth’. 


In Exercise A158 you met some identities involving cosh and sinh that are 
very similar to identities involving cos and sin. In fact, for every identity 
satisfied by trigonometric functions, there is a corresponding identity 
involving hyperbolic functions. A table comparing the most useful 
trigonometric and hyperbolic identities is included in the module 
Handbook. 


4.2 Graphs of hyperbolic functions 


We now turn our attention to sketching the graphs of the hyperbolic 
functions. You will see that they bear little or no resemblance to the 
graphs of the corresponding trigonometric functions. 




Worked Exercise A86 


Sketch the graph of the function 
$$f(x) = \cosh x.$$


Solution 


Since $\cosh x = \tfrac{1}{2}\left(e^{x} + e^{-x}\right)$, we have to ‘take the average’ of the
graphs of $y = e^{x}$ and $y = e^{-x}$: for each value of x, the required value is
halfway between the values for these graphs. A sketch helps here.


We use Strategy A3. 
1. f(x) = cosh x has domain R. 
2. f is even, since, for all x in R,
$$f(-x) = \cosh(-x) = \tfrac{1}{2}\left(e^{-x} + e^{x}\right) = \tfrac{1}{2}\left(e^{x} + e^{-x}\right) = \cosh x = f(x).$$


It is therefore sufficient to consider the features of the graph of f 
for x > 0, and then to reflect the graph in the y-axis. 


3. To find any x-intercepts of f we have to solve the equation
$$\tfrac{1}{2}\left(e^{x} + e^{-x}\right) = 0.$$
However, $e^{x}$ and $e^{-x}$ are positive for all x in R, so cosh x is positive
for all x in R. (The entire graph lies above the x-axis.) Thus f has no
x-intercepts.

Also, $f(0) = \tfrac{1}{2}\left(e^{0} + e^{-0}\right) = \tfrac{1}{2}(1 + 1) = 1$, so the y-intercept is 1.

4. As shown in step 3, cosh x is positive for all x in R.

5. $f'(x) = \sinh x = \tfrac{1}{2}\left(e^{x} - e^{-x}\right)$. (The graphs of $y = e^{x}$ and $y = e^{-x}$
show that $e^{x} > e^{-x}$ for x > 0, so sinh x > 0 for x > 0.)
So $f'(x)$ is positive when x > 0 and zero when x = 0, and

• f is increasing on the interval (0, ∞)
• f has a local minimum at 0, with value cosh(0) = 1.

6. Since $e^{x} \to \infty$ as $x \to \infty$ and $e^{-x} \to 0$ as $x \to \infty$,
$$\cosh x \to \infty \text{ as } x \to \infty.$$




So the graph of the cosh function bears little resemblance to that of the 
cosine function; for example, 


$$\cosh x \ge 1, \quad \text{for all } x \text{ in } \mathbb{R},$$
whereas
$$-1 \le \cos x \le 1, \quad \text{for all } x \text{ in } \mathbb{R}.$$

Moreover, unlike the cosh function, the cosine function is periodic with
period 2π, so its graph looks the same on successive intervals of length 2π.


By working through the next exercise, you will discover that there is also 
little similarity between the graphs of the sinh function and the sine 
function. 


Exercise A160 


Sketch the graph of the function 
$$f(x) = \sinh x.$$


Using the properties of the functions cosh and sinh, we can now sketch the 
graphs of their reciprocals, sech and cosech. Graphs of all the hyperbolic 
functions are given in the module Handbook for reference. 


Worked Exercise A87 


Sketch the graph of the function
$$f(x) = \operatorname{sech} x = \frac{1}{\cosh x}.$$


Solution


1. f has domain R, since cosh is never 0. 


2. f is an even function, since cosh x is an even function. It is 
therefore sufficient to consider the features of the graph of f for 
x > 0, and then to reflect the graph we obtain in the y-axis. 


3. We know that cosh x ≥ 1 for all x in R, so, for all x in R,
$$0 < \operatorname{sech} x \le 1.$$
So f has no x-intercepts. Also,
$$f(0) = \operatorname{sech} 0 = \frac{1}{\cosh 0} = \frac{1}{1} = 1,$$
so the y-intercept is 1.


4. As shown in step 3, f is positive for all x ∈ R.

5. Since cosh x is increasing on (0, ∞) and has a local minimum at 0,

• sech x is decreasing on (0, ∞)
• sech x has a local maximum at 0 with value sech(0) = 1.

6. Since $\cosh x \to \infty$ as $x \to \infty$, we have
$$\operatorname{sech} x \to 0 \text{ as } x \to \infty.$$


So y = 0 is a horizontal asymptote. 


This information enables us to produce the following sketch. 


[Sketch of the graph of y = sech x, with local maximum at (0, 1) and horizontal asymptote y = 0]


Exercise A161 


Sketch the graph of the function 


$$f(x) = \operatorname{cosech} x.$$




5 Conics 


In this section you will revise conics: circles, ellipses, parabolas and 
hyperbolas. 


5.1 Conic sections 


In your previous studies you will have met three different ways to define a 
conic: by slicing a double cone with a plane; geometrically, using the 
focus-directrix definition, and algebraically, using an equation. You will 
review all three ways in this section. The three definitions are equivalent, 
but the proof of this is not given here. 


A conic section, or conic, is a curve obtained by slicing a double cone 
with a plane, as illustrated in Figures 34 to 37. The type of conic obtained 
depends on the orientation of the slicing plane, as indicated in the figures. 


Figure 34 Slicing a double cone to obtain a circle (plane perpendicular to
axis)

Figure 35 Slicing a double cone to obtain an ellipse (plane tilted slightly
from the perpendicular)

Figure 36 Slicing a double cone to obtain a parabola (plane parallel to side
of cone)

Figure 37 Slicing a double cone to obtain a hyperbola (plane tilted further)


A degenerate conic section, or degenerate conic, is obtained when 
the slicing plane passes through the apex of the double cone. It may be a 
single point, a straight line or two intersecting straight lines, as illustrated 
in Figure 38. 




Figure 38 Degenerate conics obtained by slicing a double cone 


A non-degenerate conic section, or non-degenerate conic, is a conic 
that is not degenerate. It may be a circle, an ellipse, a parabola or a 
hyperbola; sometimes a circle is considered to be a special type of ellipse. 


A circle can be defined geometrically as the set of points P such that the 
distance of P from a fixed point, the centre, is constant, as illustrated in 
Figure 39. 




Figure 39 A circle 


An ellipse, a parabola, and a hyperbola can also all be defined 
geometrically as the set of points whose distance from a fixed point and a 
fixed line are related. (In this unit, when we refer to the distance between 
a point and a line, we always mean the shortest such distance.) It can be 
shown that the set of points P such that the distance of P from a fixed 
point is a constant multiple, e, of the distance of P from a fixed line is 


• an ellipse if 0 < e < 1
• a parabola if e = 1
• a hyperbola if e > 1.




The fixed point is called the focus of the conic, the fixed line is called its 
directrix, and the constant multiple e is called its eccentricity. These 
focus—directrix properties are illustrated in Figure 40. (It seems natural to 
use the letter e for eccentricity. Of course, this is quite unrelated to the use 
of e as the symbol for the irrational number 2.718..., mentioned in 
Subsection 1.2). 




Figure 40 An ellipse, parabola and hyperbola obtained from a focus F and 
directrix d, with eccentricity e = PF/PM 


The circle has no focus-directrix property, though some texts consider the 
focus to be the centre of the circle and the directrix to be ‘at infinity’, so 
that the circle is obtained when e = 0. 




The ellipse, the parabola and the hyperbola were given their names by 
the Greek geometer Apollonius (c.262—c.190 BCE) in his Conics, a 
work of eight books completed in about 200 BCE which completely 
reformed the ancient study of conic sections. Conics was translated 
by the astronomer Edmund Halley (1656-1742) and published in 
1710, see Figure 41. 


A (non-degenerate) conic is said to be in standard position if it is 
positioned in the plane as follows: 


For a circle: its centre is at the origin. 


For an ellipse: its axes of symmetry are the x- and y-axes, and its
largest width is along the x-axis. 


For a parabola: its axis of symmetry is the x-axis, it passes through the 
origin and its other points lie to the right of the origin. 


For a hyperbola: its axes of symmetry are the x- and y-axes, and it 
crosses the x-axis. 


An ellipse, a parabola and a hyperbola all in standard position are 
illustrated in Figure 42. 




Figure 42 Conics in standard position: (a) ellipse (b) parabola and 
(c) hyperbola 


Any conic, lying anywhere in the plane, can be rotated and translated so
that it is in standard position. The equation of a conic in standard position
can always be expressed in a straightforward and easily recognisable
form. The equations of conics in other positions are more complicated; you
will meet the general equation of a conic in Subsection 5.3.


Figure 41 Frontispiece of Apollonius’ Conics


5.2 Conics in standard position 


In this subsection you will look individually at the circle, the parabola, the 
ellipse and the hyperbola in standard position and consider their 
focus—directrix definitions and equations. 


Circle 


In Unit A1 you saw that the equation of a circle of radius a with its centre
at the origin is
$$x^2 + y^2 = a^2.$$


Such a circle is in standard position. 


Parabola (e = 1) 


A parabola is the set of points P in the plane whose distances from a fixed 
point F are equal to their distances from a fixed line d. 


A parabola is in standard position if 
e the focus F lies on the x-axis, with coordinates (a,0), where a > 0 
e the directrix d is the line with equation x = —a. 


The features of a parabola in standard position are shown in Figure 43. 


Figure 43 A parabola in standard position


As shown in Figure 43, the origin lies on a parabola in standard position,
since it is equidistant from F and d; it is the vertex of the parabola. The
x-axis is the axis of the parabola, since the parabola is symmetric with
respect to this line.

The equation of a parabola in standard position with focus (a, 0) can be
expressed as
$$y^2 = 4ax.$$
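
The focus-directrix definition can be tested numerically against this equation. The following Python sketch is an illustration only; the function name `parabola_distances` and the sample values are assumptions made for this example. It takes a point P on the parabola and computes its distance from the focus F = (a, 0) and from the directrix x = −a, which should agree because e = 1.

```python
import math
from typing import Tuple


def parabola_distances(a: float, y: float) -> Tuple[float, float]:
    """For the point P on y^2 = 4ax with second coordinate y, return (PF, Pd)."""
    if a <= 0:
        raise ValueError("standard position requires a > 0")
    x = y * y / (4 * a)        # solve y^2 = 4ax for x
    pf = math.hypot(x - a, y)  # distance from P to the focus F = (a, 0)
    pd = abs(x + a)            # distance from P to the directrix x = -a
    return pf, pd


pf, pd = parabola_distances(a=1.5, y=4.0)
print(pf, pd)  # the two distances should agree up to rounding
```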


Exercise A162 


Sketch the parabola with equation $y^2 = 2x$.


Ellipse (0 < e < 1) 


An ellipse with eccentricity e (where 0 < e < 1) is the set of points P in 
the plane whose distances from a fixed point F are e times their distances 
from a fixed line d. An ellipse is in standard position if 


• the focus F lies on the x-axis, with coordinates (ae, 0), where a > 0
• the directrix d is the line with equation x = a/e.

An ellipse in standard position is symmetrical about the y-axis, so there is
a second focus $F_2$ with coordinates (−ae, 0), and a second directrix $d_2$ with
equation x = −a/e. The features of an ellipse in standard position are
shown in Figure 44.




Figure 44 An ellipse in standard position 


As shown in Figure 44, an ellipse in standard position intersects the x-axis
at the points (±a, 0), and intersects the y-axis at two points which we label
(0, ±b). It can be shown that a and b are related to the eccentricity e of
the ellipse by the equation
$$e = \sqrt{1 - \frac{b^2}{a^2}}.$$

The four points (±a, 0) and (0, ±b) are the vertices of the ellipse. The
origin is the centre of the ellipse. The largest width is along the x-axis,
that is, a > b > 0. The line segment joining the points (−a, 0) and (a, 0) is
the major axis of the ellipse, and the line segment joining the points
(0, −b) and (0, b) is the minor axis of the ellipse.


The equation of an ellipse in standard position intersecting the x-axis at
(±a, 0) and the y-axis at (0, ±b) can be expressed as
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$
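
The relationship between a, b and e can be explored numerically. The Python sketch below is an illustration only; the function names and the sample values a = 5, b = 3 are assumptions made for this example. It computes the eccentricity from a and b, and then checks that a point P on the ellipse satisfies PF = e × (distance from P to d), using the focus (ae, 0) and the directrix x = a/e.

```python
import math


def eccentricity(a: float, b: float) -> float:
    """Eccentricity of the ellipse x^2/a^2 + y^2/b^2 = 1, where a > b > 0."""
    if not a > b > 0:
        raise ValueError("expected a > b > 0")
    return math.sqrt(1 - b ** 2 / a ** 2)


def focus_directrix_ratio(a: float, b: float, x: float) -> float:
    """Return PF / Pd for the point P on the upper half of the ellipse
    with first coordinate x, where F = (ae, 0) and d is the line x = a/e."""
    if abs(x) > a:
        raise ValueError("expected -a <= x <= a")
    e = eccentricity(a, b)
    y = b * math.sqrt(1 - x ** 2 / a ** 2)  # solve the equation for y >= 0
    pf = math.hypot(x - a * e, y)           # distance from P to the focus
    pd = abs(x - a / e)                     # distance from P to the directrix
    return pf / pd


print(eccentricity(5, 3))
print(focus_directrix_ratio(5, 3, 1.0))  # should agree with the line above
```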


Exercise A163 


2 2 
Sketch the ellipse with equation > + > = 1, and find its eccentricity. 




Hyperbola (e > 1) 


A hyperbola is the set of points P in the plane whose distances from a 
fixed point F are e times their distances from a fixed line d, where e > 1. 
A hyperbola is in standard position if 


• the focus F lies on the x-axis, with coordinates (ae, 0), where a > 0
• the directrix d is the line with equation x = a/e.

A hyperbola in standard position is symmetrical about the y-axis, so there
is a second focus $F_2$ with coordinates (−ae, 0), and a second directrix $d_2$
with equation x = −a/e. The features of a hyperbola in standard position
are shown in Figure 45.


Figure 45 A hyperbola in standard position 


As shown in Figure 45, a hyperbola in standard position intersects the
x-axis at the points (±a, 0), which are the vertices of the hyperbola. The
origin is the centre of the hyperbola. The hyperbola has two asymptotes,
which are lines passing through the origin. If we write the equations of
these lines in the form
$$y = \pm\frac{b}{a}x,$$
then it can be shown that a and b are related to the eccentricity e of the
hyperbola by the equation
$$e = \sqrt{1 + \frac{b^2}{a^2}}.$$

The equation of a hyperbola in standard position intersecting the x-axis at
(±a, 0) and with asymptotes y = ±(b/a)x can be expressed as
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1.$$


A hyperbola in standard position with a = b has asymptotes y = ±x,
which are perpendicular lines. A hyperbola whose asymptotes are 
perpendicular is called a rectangular hyperbola. In Subsection 1.2 you 
saw that the graphs of linear rational functions are rectangular hyperbolas 
with asymptotes parallel to the x- and y-axes. An example of a 
rectangular hyperbola in standard position is shown in Figure 46. 




Figure 46 A rectangular hyperbola in standard position 


Exercise A164 


a? 
2 


Sketch the hyperbola with equation 7 = 1] showing the asymptotes, 


and find its eccentricity. 


5.3 General equation of a conic 


The equation of a conic in standard position can always be expressed in 
the straightforward and easily recognisable forms given in the last 
subsection. However, not all conics are in standard position. In this 
subsection you will meet the general equation of a conic. 


Let us begin by considering circles. In Unit A1 you saw that the equation
of a circle with centre (a, b) and radius r is
$$(x - a)^2 + (y - b)^2 = r^2. \qquad (3)$$
For example, the equation
$$(x + 1)^2 + (y - 2)^2 = 3$$
represents a circle with centre (−1, 2) and radius $\sqrt{3}$.
We can multiply out the brackets in this equation to get
$$x^2 + 2x + 1 + y^2 - 4y + 4 = 3,$$
that is,
$$x^2 + y^2 + 2x - 4y + 2 = 0.$$




In fact, if we have the equation of a circle in form (3), then we can always
multiply out the brackets to write it in the alternative form
$$x^2 + y^2 + fx + gy + h = 0, \qquad (4)$$
where f, g and h are real numbers. We may also choose to multiply
through by a non-zero constant (for example, we might want to do this to
avoid unpleasant fractions in the equation); this will give an equation of
the form
$$ax^2 + ay^2 + fx + gy + h = 0, \qquad (5)$$
where a is a non-zero constant and f, g and h are real numbers. Note that
the coefficients of $x^2$ and $y^2$ are equal in this equation, and that the values
of f, g and h will in general be different from those in equation (4).


Not every equation of this form represents a circle. For example, the
equation $x^2 + y^2 + 1 = 0$ does not represent a circle since there are no
points (x, y) satisfying it, and the equation $x^2 + y^2 = 0$ represents the
single point (0, 0). If we are given an equation of form (4) or (5), then we
can determine whether it represents a circle, and, if so, find its centre and 
radius by using the method of completing the square, as demonstrated in 
the next worked exercise. 


Worked Exercise A88 


Show that the equation 
$$x^2 + y^2 - 4x + 6y + 9 = 0$$


represents a circle, and find its centre and radius. 


If you want to complete the square in an equation of form (5), where
a ≠ 1, then you can start by dividing through by a.
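
The method of completing the square can be expressed as a short procedure. The following Python sketch is an illustration only; the function name `circle_from_equation` and its return convention are assumptions made for this example. It divides an equation of form (5) through by a, completes the square to obtain $(x + f/2)^2 + (y + g/2)^2 = f^2/4 + g^2/4 - h$, and reports a circle only when the right-hand side is positive.

```python
import math
from typing import Optional, Tuple


def circle_from_equation(
    a: float, f: float, g: float, h: float
) -> Optional[Tuple[Tuple[float, float], float]]:
    """Interpret a*x^2 + a*y^2 + f*x + g*y + h = 0.

    Returns ((centre_x, centre_y), radius) if the equation represents a
    circle, and None if it represents a single point or the empty set.
    """
    if a == 0:
        raise ValueError("a must be non-zero")
    f, g, h = f / a, g / a, h / a      # divide through by a to reach form (4)
    cx, cy = -f / 2, -g / 2            # centre, read off after completing the square
    r_squared = cx * cx + cy * cy - h  # right-hand side after completing the square
    if r_squared <= 0:
        return None
    return (cx, cy), math.sqrt(r_squared)


# The example from the text: x^2 + y^2 + 2x - 4y + 2 = 0.
print(circle_from_equation(1, 2, -4, 2))
# The two non-examples from the text.
print(circle_from_equation(1, 0, 0, 1))
print(circle_from_equation(1, 0, 0, 0))
```

For the first equation the sketch should recover the centre (−1, 2) and radius $\sqrt{3}$ found earlier, and for the other two it should return `None`.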


Exercise A165 


For each of the following equations, determine whether it represents a 
circle, and, if it does, find the centre and radius of the circle. 


(a) $x^2 + y^2 - 2x - 6y + 1 = 0$    (b) $x^2 + y^2 + x + y + 1 = 0$
(c) $x^2 + y^2 - 2x + 4y + 5 = 0$    (d) $2x^2 + 2y^2 + x - 3y - 5 = 0$


Now let us turn to conics in general. It can be shown that any conic, in 
any position in the plane, can be described by an equation of the form 


$$Ax^2 + Bxy + Cy^2 + Fx + Gy + H = 0, \qquad (6)$$
where A, B and C are not all zero.

Notice that when A = C = 1 and B = 0 equation (6) is the equation of a
circle in form (4). Also, each of the (easily recognisable) equations of the
non-degenerate conics in standard position can be rearranged to form (6):

$y^2 = 4ax$ can be expressed as $4ax - y^2 = 0$

$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ can be expressed as $\frac{1}{a^2}x^2 + \frac{1}{b^2}y^2 - 1 = 0$

$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$ can be expressed as $\frac{1}{a^2}x^2 - \frac{1}{b^2}y^2 - 1 = 0$.


The equations of degenerate conics can also be expressed in form (6). For 
example: 


• $x^2 + y^2 = 0$ represents the single point (0, 0)

• $x^2 - 2xy + y^2 = 0$ represents the single line y = x, since
$x^2 - 2xy + y^2 = (x - y)^2$

• $x^2 - y^2 = 0$ represents the pair of lines y = ±x, since
$x^2 - y^2 = (x + y)(x - y)$.

Not every equation of form (6) represents a conic. For example, you have
already seen that there are no points satisfying the equation
$x^2 + y^2 + 1 = 0$, so in this case the equation describes the empty set.
However, it turns out that every equation of form (6) represents either a 
conic or the empty set, so defining the empty set to be a degenerate conic 
yields the following theorem, which is stated without proof. 


Theorem A18 
Any conic has an equation of the form 
$$Ax^2 + Bxy + Cy^2 + Fx + Gy + H = 0,$$

where A, B, C, F, G and H are real numbers, and A, B and C are
not all zero. Conversely, the set of all points in R? whose coordinates 
(x,y) satisfy an equation of this form is a conic. 




Given the equation of a non-degenerate conic, such as 


$$x^2 - 4xy - 2y^2 + 6x + 12y + 21 = 0,$$


we would like to be able to decide whether it represents an ellipse, a 
hyperbola or a parabola. (We know it is not a circle because of the 
non-zero term in xy.) A method of classifying non-degenerate conics from 
their equations will be established in Book C Linear algebra, where you 
will also meet the three-dimensional analogues of conics, which are known 
as quadrics. 


5.4 Parametrising conics 


Non-degenerate conics in standard position are not the graphs of functions,
because for a function, each value of x in the domain must give rise to a
single value of y in the codomain. For example, the unit circle is not the
graph of a function, because if we take x = 0, for instance, then the
equation $x^2 + y^2 = 1$ of the circle gives y = 1 and y = −1.

We can, however, describe a conic using a function f whose domain is an
interval and whose codomain is $\mathbb{R}^2$. To do this, we define a function of the
form
$$f : I \to \mathbb{R}^2,$$
where I is an interval, such that the image set of f is the conic.


Such a function is called a parametrisation of the conic, and in this 
subsection you will see parametrisations for each different type of 
non-degenerate conic in standard position. First, however, we review the 
idea of parametrisation by applying it to lines in the plane. 


Parametrising lines 


In Subsection 2.5 of Unit A1 you saw that the following two sets are equal:
$$\{(x, y) \in \mathbb{R}^2 : 2x + y - 3 = 0\} \quad \text{and} \quad \{(t + 1, 1 - 2t) : t \in \mathbb{R}\}.$$
The equation 2x + y − 3 = 0 can be rewritten as y = 3 − 2x, so the
equality of the two sets above shows that, if
$$x = t + 1 \quad \text{and} \quad y = 1 - 2t, \qquad (7)$$
then the point (x, y) traverses the whole line y = 3 − 2x as t runs through
all values in R. Thus the function
$$f : \mathbb{R} \to \mathbb{R}^2, \quad t \mapsto (t + 1, 1 - 2t)$$
maps R to the line y = 3 − 2x. We say that t is a parameter, the
equations (7) are parametric equations for the line, and the function f
is a parametrisation of the line, which we can also write as
$$f(t) = (t + 1, 1 - 2t), \quad \text{for } t \in \mathbb{R}.$$

If we eliminate the parameter t by writing
$$t = x - 1,$$
we obtain
$$y = 1 - 2t = 1 - 2(x - 1) = 3 - 2x,$$
as expected.


Note that this parametrisation of the line y = 3 — 2x is not unique. To see 
this, suppose we define another parametrisation g as follows: 


$$g(t) = (2t, 3 - 4t), \quad \text{for } t \in \mathbb{R}.$$
This function corresponds to the parametric equations
$$x = 2t, \quad y = 3 - 4t.$$
Again, we can eliminate the parameter t by writing
$$t = x/2,$$
so
$$y = 3 - 4(x/2) = 3 - 2x,$$


as before. This shows that every point given by the parametrisation g is a 
point on the line y = 3 — 2x. On the other hand, if (a,b) is any point on 
the line y = 3 — 2x, then b = 3 — 2a, and for t = a/2 we have 


g(t) = g(a/2) = (2(a/2),3 — 4(a/2)) = (a,3 — 2a) = (a,b), 
so every point on the line y = 3 — 2x corresponds to some value of t. 


This shows that both parametrisations f and g give exactly the same line 
y = 3 — 2x. The difference between the two parametrisations is that the 
point (x,y) defined by the parametric equations traverses the line in 
different ways. Indeed, it is true in general that any line has many different 
parametrisations, where the line is traversed in different ways. 
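
The comparison between the two parametrisations can be made concrete with a short computation. The Python sketch below is an illustration only; the helper name `on_line` and the sample parameter values are assumptions made for this example. It checks that points given by f and by g satisfy 2x + y − 3 = 0, and that g(t) coincides with f(2t − 1), which is one way to see that the two functions traverse the same line at different rates.

```python
from typing import Tuple

Point = Tuple[float, float]


def f(t: float) -> Point:
    """The parametrisation f(t) = (t + 1, 1 - 2t)."""
    return (t + 1, 1 - 2 * t)


def g(t: float) -> Point:
    """The parametrisation g(t) = (2t, 3 - 4t)."""
    return (2 * t, 3 - 4 * t)


def on_line(point: Point, tol: float = 1e-12) -> bool:
    """Check whether the point satisfies 2x + y - 3 = 0 (up to rounding)."""
    x, y = point
    return abs(2 * x + y - 3) <= tol


for t in (-2.0, 0.0, 0.75, 3.5):
    print(t, on_line(f(t)), on_line(g(t)), g(t) == f(2 * t - 1))
```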


Exercise A166 


(a) Sketch the line with the following parametrisation and find its
equation in the form y = mx + c.

$$f(t) = (t + 1, t - 1), \quad \text{for } t \in \mathbb{R}.$$
(b) Show that
$$g(t) = (2t, 2t - 2), \quad \text{for } t \in \mathbb{R}$$


is another parametrisation of this line. 




We now look at how to obtain a parametrisation of a (non-vertical) line 
given two points on it. 


The line through any pair of points (p, q) and (r, s), where r ≠ p, is given
by
$$y - q = \frac{s - q}{r - p}(x - p). \qquad (8)$$
To parametrise this line, we set
t = (x − p)/(r − p). (You would not be expected to think of this!)
Rearranging gives x = p + (r − p)t.
Then substituting t = (x − p)/(r − p) in equation (8) for the line gives
$$y - q = t(s - q), \quad \text{so} \quad y = q + (s - q)t.$$
These parametric equations
$$x = p + (r - p)t, \quad y = q + (s - q)t$$
correspond to the following parametrisation for this line
$$\alpha(t) = (p + (r - p)t,\ q + (s - q)t), \quad \text{for } t \text{ in } \mathbb{R},$$


as illustrated in Figure 47. Here the symbol α has been used for the
parametrisation function; any symbol can be used.
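
This construction translates directly into a small function. The Python sketch below is an illustration only; the name `line_parametrisation` and the sample points (0, 1) and (2, 5) are assumptions made for this example. Note that t = 0 gives the point (p, q) and t = 1 gives the point (r, s).

```python
from typing import Callable, Tuple

Point = Tuple[float, float]


def line_parametrisation(p: float, q: float, r: float, s: float) -> Callable[[float], Point]:
    """Return alpha, where alpha(t) = (p + (r - p)t, q + (s - q)t)."""
    if (p, q) == (r, s):
        raise ValueError("the two points must be distinct")

    def alpha(t: float) -> Point:
        return (p + (r - p) * t, q + (s - q) * t)

    return alpha


alpha = line_parametrisation(0, 1, 2, 5)
print(alpha(0))    # the first point
print(alpha(1))    # the second point
print(alpha(0.5))  # the point halfway between them
```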


Figure 47 A parametrisation of the line through two points (p, q) and (r, s)

A parametrisation of a vertical line through the point (p, 0) is
$$\alpha(t) = (p, t), \quad \text{for } t \in \mathbb{R},$$
corresponding to the parametric equations
$$x = p, \quad y = t.$$


Exercise A167 


Consider the line through the two points (1,2) and (3,6). 
(a) Write down a parametrisation for this line. 


(b) Which values of the parameter t correspond to the points (2, 4), 
(7,14) and (0,0)? 




Circles 


We now return to the question of how to parametrise conics, beginning 
with the unit circle. 


By the definitions of the sine and cosine functions, if P(x, y) is any point
on the unit circle, and t is the angle in radians measured anticlockwise
from the x-axis to the line OP, then
$$x = \cos t, \quad y = \sin t. \qquad (9)$$
As the angle t increases from 0 to 2π, the point (x, y) travels once round
the circle anticlockwise, starting and ending at the point (1, 0), as shown in
Figure 48. Note that even though t = 0 and t = 2π give the same point, it
is conventional to include both values in the range for this parametrisation.
Thus the function
$$f : [0, 2\pi] \to \mathbb{R}^2, \quad t \mapsto (\cos t, \sin t)$$
maps the interval [0, 2π] to the unit circle, as shown in Figure 49.




Figure 49 Mapping an interval to the unit circle 


Equations (9) are parametric equations for the unit circle, and the function
f is a parametrisation of the unit circle, which we can also write as
$$f(t) = (\cos t, \sin t), \quad t \in [0, 2\pi].$$
For this parametrisation, we can eliminate the parameter t by writing
$$x = \cos t \quad \text{and} \quad y = \sin t,$$
and using the trigonometric identity $\cos^2 t + \sin^2 t = 1$; this gives the
equation $x^2 + y^2 = 1$, as expected.


Exercise A168 


Mark on a sketch of the unit circle the coordinates of the points that 
correspond to the following values of the parameter t: 


t = π/6, t = π/2, t = 3π/4, t = π, t = 3π/2, t = 5π/3.


Figure 48 A point on the unit circle determined by the angle t


Figure 50 A parametrisation of a circle of radius a centred at the origin


In general, the parametric equations for a curve have the form 


$$x = f_1(t), \quad y = f_2(t),$$
where $f_1$ and $f_2$ are real functions of the parameter t. The functions $f_1$
and $f_2$ have the same domain, which is usually an interval.

For a single revolution of the circle, an appropriate interval is [0, 2π], as we
saw above. Another appropriate interval for the circle is [−π, π], and a
larger interval, such as [0, 4π), will trace out the points of the circle more
than once, in this case exactly twice each. Since a parametrisation is a
function, every element in the domain must have a single image, but
elements in the codomain may be the image of more than one element of
the domain: the function need not be one-to-one. For example, in the
case of the parametrisation of the unit circle using the interval [0, 2π] given
above, the values t = 0 and t = 2π give the same point, (1, 0).


So far you have seen a parametrisation of the unit circle; that is, the circle
of radius 1 in standard position. For a circle of radius a in standard
position, as shown in Figure 50, which has equation $x^2 + y^2 = a^2$, we can
use the parametrisation
$$\alpha(t) = (a \cos t, a \sin t), \quad t \in [0, 2\pi];$$
this corresponds to the parametric equations
$$x = a \cos t, \quad y = a \sin t.$$
For the parametrisation above, we can eliminate the parameter t by writing
$$x/a = \cos t \quad \text{and} \quad y/a = \sin t,$$
and using the trigonometric identity $\cos^2 t + \sin^2 t = 1$; this gives the
equation $x^2 + y^2 = a^2$, as expected.


To find a parametrisation of a circle of radius a centred at the point (p,q) 
we apply a (p,q)-translation to all the points on a circle of radius a centred 
at the origin. Thus a parametrisation of a circle of radius a centred 


at (p, q) is
$$x = p + a \cos t, \quad y = q + a \sin t, \quad t \in [0, 2\pi];$$
that is,
$$\alpha(t) = (p + a \cos t,\ q + a \sin t), \quad t \in [0, 2\pi].$$
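
The translated parametrisation can be checked against the equation $(x - p)^2 + (y - q)^2 = a^2$ of the circle. The Python sketch below is an illustration only; the function name `circle_point` and the sample circle (centre (1, −2), radius 2) are assumptions made for this example.

```python
import math
from typing import Tuple


def circle_point(p: float, q: float, a: float, t: float) -> Tuple[float, float]:
    """Return alpha(t) = (p + a cos t, q + a sin t)."""
    return (p + a * math.cos(t), q + a * math.sin(t))


# Sample eight parameter values in [0, 2*pi] and test the circle equation.
p, q, a = 1.0, -2.0, 2.0
for k in range(9):
    t = 2 * math.pi * k / 8
    x, y = circle_point(p, q, a, t)
    print(round(t, 3), math.isclose((x - p) ** 2 + (y - q) ** 2, a ** 2))
```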


Exercise A169 


Write down a parametrisation for each of the following. 
(a) The circle centred at the origin, with radius 3. 


(b) The circle with centre (2,1) and radius 3. 


Another parametrisation for the unit circle is
$$g(t) = (\cos 2\pi t, \sin 2\pi t), \quad t \in [0, 1],$$
which corresponds to the parametric equations
$$x = \cos 2\pi t, \quad y = \sin 2\pi t.$$
We have
$$x^2 + y^2 = \cos^2 2\pi t + \sin^2 2\pi t = 1,$$
so (x, y) is a point on the unit circle. As t increases from 0 to 1, 2πt
increases from 0 to 2π, so the point (x, y) moves once round the circle.


You have now seen two different parametrisations for the unit circle
traversed once anticlockwise, namely
$$f(t) = (\cos t, \sin t), \quad t \in [0, 2\pi],$$
and
$$g(t) = (\cos 2\pi t, \sin 2\pi t), \quad t \in [0, 1].$$

This illustrates the important fact that, just as you saw in the case of a
line, a parametrisation of a given curve is not unique. Note that the point
given by g(t) moves round the unit circle 2π times more rapidly than the
point given by f(t). Different parametrisations of a curve may correspond
to different modes of traversing the curve, and in general this may lead to
a different starting point as well as a different pace or direction of travel.


In the remainder of this subsection we will briefly review the main features 
of the usual parametrisations for an ellipse, a parabola and a hyperbola in 
standard position. 


Ellipse in standard position 


For an ellipse in standard position, we use the parametrisation
$$\alpha(t) = (a \cos t, b \sin t), \quad t \in [0, 2\pi];$$
this gives the parametric equations
$$x = a \cos t, \quad y = b \sin t.$$


This parametrisation is illustrated in Figure 51. 




Figure 51 A parametrisation of an ellipse in standard position 




For this parametrisation, we can eliminate the parameter t by writing
$$\frac{x}{a} = \cos t, \quad \frac{y}{b} = \sin t,$$
and using the trigonometric identity $\cos^2 t + \sin^2 t = 1$; this gives the
equation
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$$
as expected.

When b = a, the equation reduces to that of the circle centred at the
origin, with radius a; that is, $x^2 + y^2 = a^2$, as discussed above.
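
The elimination of the parameter can be mirrored numerically. The Python sketch below is an illustration only; the names `ellipse_point` and `ellipse_residual` and the sample values a = 3, b = 2 are assumptions made for this example. The residual is the left-hand side of the ellipse equation minus 1, so it should be zero, up to rounding, at every point given by the parametrisation.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def ellipse_point(a: float, b: float, t: float) -> Point:
    """Return alpha(t) = (a cos t, b sin t)."""
    return (a * math.cos(t), b * math.sin(t))


def ellipse_residual(a: float, b: float, point: Point) -> float:
    """Return x^2/a^2 + y^2/b^2 - 1 for the given point."""
    x, y = point
    return x ** 2 / a ** 2 + y ** 2 / b ** 2 - 1


a, b = 3.0, 2.0
for t in (0.0, math.pi / 6, math.pi / 2, math.pi, 5.0):
    print(round(t, 3), abs(ellipse_residual(a, b, ellipse_point(a, b, t))) < 1e-12)
```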


Parabola in standard position 


For a parabola in standard position, we use the standard parametrisation
$$\alpha(t) = (at^2, 2at), \quad t \in \mathbb{R};$$
this gives the parametric equations
$$x = at^2, \quad y = 2at.$$


This parametrisation is illustrated in Figure 52. 




Figure 52 A parametrisation of a parabola in standard position

For this parametrisation, we can eliminate the parameter t by writing
$$y^2 = (2at)^2 = 4a^2t^2 = 4a \times at^2 = 4ax;$$
this gives $y^2 = 4ax$, as expected.


Hyperbola in standard position 


For a hyperbola in standard position, we use the parametrisation
$$\alpha(t) = (a \sec t, b \tan t), \quad t \in [-\pi, \pi], \text{ excluding } -\pi/2 \text{ and } \pi/2;$$
this gives the parametric equations
$$x = a \sec t, \quad y = b \tan t.$$

The values π/2 and −π/2 are excluded since neither sec nor tan is defined
at these angles.


This parametrisation is illustrated in Figure 53. 


Figure 53 A parametrisation of a hyperbola in standard position 

For this parametrisation, we can eliminate the parameter t by writing
$$\frac{x}{a} = \sec t, \quad \frac{y}{b} = \tan t,$$
and using the trigonometric identity $\sec^2 t - \tan^2 t = 1$; this gives the
equation
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1,$$
as expected.
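
The same kind of numerical check works for the hyperbola. The Python sketch below is an illustration only; the names `hyperbola_point` and `hyperbola_residual`, the cut-off used to detect the excluded angles, and the sample values a = 2, b = 1 are assumptions made for this example. It also shows that values of t between −π/2 and π/2 give points with x > 0, while the remaining values of t in [−π, π] give points with x < 0, since sec t has the same sign as cos t.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def hyperbola_point(a: float, b: float, t: float) -> Point:
    """Return alpha(t) = (a sec t, b tan t), where sec t = 1 / cos t."""
    c = math.cos(t)
    if abs(c) < 1e-12:
        raise ValueError("sec t and tan t are not defined at this angle")
    return (a / c, b * math.tan(t))


def hyperbola_residual(a: float, b: float, point: Point) -> float:
    """Return x^2/a^2 - y^2/b^2 - 1 for the given point."""
    x, y = point
    return x ** 2 / a ** 2 - y ** 2 / b ** 2 - 1


a, b = 2.0, 1.0
for t in (-1.2, 0.0, 0.3, 2.5, -3.0):
    point = hyperbola_point(a, b, t)
    branch = "right" if point[0] > 0 else "left"
    print(t, branch, abs(hyperbola_residual(a, b, point)) < 1e-9)
```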


Exercise A170 


Write down a parametrisation for each of the following conics. 
2 2 2 2 
29 E 4 my 
@) =a W) EE 0 5-4 


(You sketched these conics in Exercises A162, A163 and A164, 
respectively.) 


Exercise A171 


Show that the points on the curve with parametrisation
$$\alpha(t) = (a \cosh t, b \sinh t), \quad t \in \mathbb{R},$$
lie on the right-hand half of the hyperbola with equation
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1.$$


This hyperbolic parametrisation links the hyperbolic functions with the 
hyperbola from which their name arises. 




5.5 Summary: some standard parametrisations


The following table gives a summary of the standard parametrisations for
lines, circles and conics.


Line through (p, q) and (r, s)
    Equation: $y - q = \frac{s - q}{r - p}(x - p)$
    Parametrisation: $\alpha(t) = (p + (r - p)t,\ q + (s - q)t)$, for $t \in \mathbb{R}$

Circle, centre (0, 0), radius a
    Equation: $x^2 + y^2 = a^2$
    Parametrisation: $\alpha(t) = (a \cos t, a \sin t)$, for $t \in [0, 2\pi]$

Ellipse in standard position
    Equation: $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$
    Parametrisation: $\alpha(t) = (a \cos t, b \sin t)$, for $t \in [0, 2\pi]$

Parabola in standard position
    Equation: $y^2 = 4ax$
    Parametrisation: $\alpha(t) = (at^2, 2at)$, for $t \in \mathbb{R}$

Hyperbola in standard position
    Equation: $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$, with asymptotes $y = \pm\frac{b}{a}x$
    Parametrisation: $\alpha(t) = (a \sec t, b \tan t)$, for $t \in [-\pi, \pi]$,
    excluding $-\pi/2$ and $\pi/2$
    or
    $\alpha(t) = (a \cosh t, b \sinh t)$, for $t \in \mathbb{R}$ (right-hand half only)




Summary 


In this unit, you have seen how the properties of a real function can be 
analysed and represented using its graph. You have learned to recognise 
and sketch the graphs of basic real functions, including some whose graphs 
are smooth curves everywhere in their domains and others whose graphs 
have ‘corners’ or ‘jumps’ (for example, the modulus function). You have 
then seen how the graphs of basic functions are modified under 
translations or scalings, and met a strategy for sketching the graphs of 
more complicated real functions by looking at how various properties of 
functions affect their graphs. 


You have met the hyperbolic functions and seen that these functions are 
related to the exponential function, but have many properties analogous to 
the trigonometric functions, though they are not periodic. Finally, you 
have studied conic sections and seen how they can be expressed in the
form of a function $f : \mathbb{R} \to \mathbb{R}^2$, known as a parametrisation of the conic.


You will continue your study of real functions in the analysis units of this 
module. 


Learning outcomes 


After working through this unit, you should be able to:

• recognise and use the graphs of the basic real functions
• understand the effect on a graph of translations and scalings
• understand how the shape of a graph of a function indicates properties of the function, such as its being increasing, decreasing, even or odd
• use the rule of a function to determine the main features of its graph as listed in the graph-sketching strategies
• sketch the graphs of a variety of real functions
• sketch the graph of a hybrid function, whose rule is defined by different formulas on different parts of its domain
• define the hyperbolic functions cosh x, sinh x and tanh x, and be familiar with their properties
• sketch the graphs of cosh x, sinh x and tanh x, and their reciprocals
• explain the term conic section, and the focus–directrix definitions of the non-degenerate conics
• sketch a conic in standard position from its equation
• obtain the equations of lines and conics in standard position from their parametric representations.




Solutions to exercises 


Solution to Exercise A138 


(a) The denominator of $f(x)$ is $1 - x^2$, which is zero when $x = 1$ or $-1$, so the domain is the set $\mathbb{R} - \{-1, 1\}$ (that is, the set of all real numbers, excluding $-1$ and $1$).

(b) This function is defined for all real numbers, so the domain is $\mathbb{R}$.

(c) The denominator of $f(x)$ is $x^2 + 5x + 4 = (x + 1)(x + 4)$, which is zero when $x = -1$ or $-4$, so the domain is the set $\mathbb{R} - \{-1, -4\}$.

(d) The denominator of $f(x)$ is $\sqrt{1 - x^2}$, which is zero when $x = 1$ or $-1$, and is not defined when $x^2 > 1$, that is, when $x > 1$ or $x < -1$, so the domain is the interval $(-1, 1)$.


Solution to Exercise A139 
(a) $(-\infty, -1) \cup (-1, 1) \cup (1, \infty)$
(b) $\mathbb{R} = (-\infty, \infty)$
(c) $(-\infty, -4) \cup (-4, -1) \cup (-1, \infty)$
(d) $(-1, 1)$


Solution to Exercise A140

(a) We have
\[ f(x) = 2x^2 - 8x + 11 = 2(x^2 - 4x) + 11 = 2((x - 2)^2 - 4) + 11 = 2(x - 2)^2 + 3. \]
Hence the parabola has vertex $(2, 3)$. So it is graph B. (Since the coefficient of $x^2$ is positive, the parabola is the same way up as the graph of $y = x^2$.)

(b) We have
\[ f(x) = -2x^2 - 8x - 5 = -2(x^2 + 4x) - 5 = -2((x + 2)^2 - 4) - 5 = -2(x + 2)^2 + 3. \]
Hence the parabola has vertex $(-2, 3)$. Since the coefficient of $x^2$ is negative, the parabola is the opposite way up to the graph of $y = x^2$. So it is graph D.

(c) We have
\[ f(x) = -2x^2 - 8x - 11 = -2(x^2 + 4x) - 11 = -2((x + 2)^2 - 4) - 11 = -2(x + 2)^2 - 3. \]
Hence the parabola has vertex $(-2, -3)$. So it is graph C. (Since the coefficient of $x^2$ is negative, the parabola is the opposite way up to the graph of $y = x^2$.)

(d) We have
\[ f(x) = 2x^2 + 8x + 11 = 2(x^2 + 4x) + 11 = 2((x + 2)^2 - 4) + 11 = 2(x + 2)^2 + 3. \]
Hence the parabola has vertex $(-2, 3)$. Since the coefficient of $x^2$ is positive, the parabola is the same way up as the graph of $y = x^2$. So it is graph A.
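The completed-square forms above can be reproduced mechanically. The following is a minimal sketch using exact rational arithmetic. The helper name vertex_form is an illustrative assumption, and the closed formulas for h and k are the standard result of completing the square rather than something stated in this solution.

```python
from fractions import Fraction

def vertex_form(a, b, c):
    """Return (h, k) such that a*x**2 + b*x + c == a*(x - h)**2 + k.

    (h, k) is the vertex of the parabola y = a*x**2 + b*x + c.
    """
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a == 0:
        raise ValueError("not a quadratic: the coefficient of x**2 is zero")
    h = -b / (2 * a)          # from a(x^2 + (b/a)x) = a((x + b/(2a))^2 - b^2/(4a^2))
    k = c - b * b / (4 * a)   # the constant left over after completing the square
    return h, k

# The four quadratics from parts (a) to (d) of this solution.
vertices = [vertex_form(2, -8, 11), vertex_form(-2, -8, -5),
            vertex_form(-2, -8, -11), vertex_form(2, 8, 11)]
```

The sign of a then decides which way up the parabola is, as in the solution.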


Solution to Exercise A141 

(a) Linear: 1, 22. 

(b) Quadratic: 2, 11, 21. 

(c) Cubic: 8, 15. 

(d) Trigonometric: 5, 6, 16, 19. 

(e) Linear rational: 7, 13. 

(f) Modulus (or related): 4, 10, 18.
(g) Integer part (or related): 9, 23. 

(h) Exponential (or related): 12, 20. 


(i) Not the graph of y as some function of x: 3, 
14, 17, 24. 


Solution to Exercise A142 


(a) The graph of $y = \cos(x/2)$ is obtained from the graph of $y = \cos x$ by a $(2, 1)$-scaling. This is graph D.

(b) The graph of $y = 2\cos x$ is obtained from the graph of $y = \cos x$ by a $(1, 2)$-scaling. This is graph C.

(c) The graph of $y = 2\cos 2x$ is obtained from the graph of $y = \cos x$ by a $(\frac{1}{2}, 2)$-scaling. This is graph A.

(d) The graph of $y = \frac{1}{2}\cos x$ is obtained from the graph of $y = \cos x$ by a $(1, \frac{1}{2})$-scaling. This is graph B.


Solution to Exercise A143 


(a) The graph of $y = (x - 2)^2 + 1$ is obtained from the graph of $y = x^2$ by a $(2, 1)$-translation. This is graph C.

(b) The graph of $y = (x + 2)^2 + 1$ is obtained from the graph of $y = x^2$ by a $(-2, 1)$-translation. This is graph D.

(c) The graph of $y = -(x - 2)^2 + 1$ is obtained from the graph of $y = x^2$ by applying both a scaling and a translation.

Consider $y = x^2$ and multiply the right-hand side by $-1$ to obtain the equation $y = -x^2$. Then replace $x$ by $x - 2$ to obtain the equation $y = -(x - 2)^2$, and finally add 1 to the right-hand side of this equation to obtain $y = -(x - 2)^2 + 1$. So the graph of $y = -(x - 2)^2 + 1$ is obtained from the graph of $y = x^2$ by a $(1, -1)$-scaling followed by a $(2, 1)$-translation. This is graph B.

(d) The graph of $y = (x - 2)^2 - 1$ is obtained from the graph of $y = x^2$ by a $(2, -1)$-translation. This is graph A.


Solution to Exercise A144 


(a) B and E are graphs of odd functions. 
(b) A and F are graphs of even functions. 


(c) C and D are graphs of functions that are 
neither odd nor even. 


Solution to Exercise A145 


(a) $x^2 - 6x + 11 = (x - 3)^2 - 9 + 11 = (x - 3)^2 + 2$,
which is always positive.

(b) $3x^2 + 12x - 1 = 3(x^2 + 4x) - 1 = 3((x + 2)^2 - 4) - 1 = 3(x + 2)^2 - 12 - 1 = 3(x + 2)^2 - 13$,
which is sometimes positive and sometimes negative (for example, positive when $x = 1$ and negative when $x = 0$).


Solution to Exercise A146 


(a) A is the graph of a function that is increasing 
but not strictly increasing. 


(b) E is the graph of a function that is strictly 
increasing. 


(c) F is the graph of a function that is decreasing 
but not strictly decreasing. 


(d) C is the graph of a function that is strictly 
decreasing. 


(e) B and D are graphs of functions that are 
increasing on part of the domain and decreasing on 
another part of the domain. 


Solution to Exercise A147 
\[ f'(x) = 4x^3 - 4x = 4x(x^2 - 1) = 4x(x - 1)(x + 1). \]

We construct a table of signs for $f'$.

x       | (−∞, −1) | −1 | (−1, 0) | 0 | (0, 1) | 1 | (1, ∞)
4x      |    −     | −  |    −    | 0 |   +    | + |   +
x − 1   |    −     | −  |    −    | − |   −    | 0 |   +
x + 1   |    −     | 0  |    +    | + |   +    | + |   +
f'(x)   |    −     | 0  |    +    | 0 |   −    | 0 |   +

Thus
• $f$ is increasing on the intervals $(-1, 0)$ and $(1, \infty)$
• $f$ is decreasing on the intervals $(-\infty, -1)$ and $(0, 1)$
• $f$ has stationary points at $x = -1$, 0 and 1.

By the First Derivative Test, we deduce that
• there is a local minimum at $x = -1$ with $f(-1) = 2$
• there is a local maximum at $x = 0$ with $f(0) = 3$
• there is a local minimum at $x = 1$ with $f(1) = 2$.
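The First Derivative Test used here can be sketched in code by comparing the sign of the derivative just to the left and just to the right of each stationary point. This is only an illustration: it assumes the function is f(x) = x^4 − 2x^2 + 3 (consistent with the derivative and the values above), and the helper name classify and the step size h are illustrative choices.

```python
def f(x):
    return x**4 - 2 * x**2 + 3

def f_prime(x):
    return 4 * x**3 - 4 * x

def classify(x0, h=0.5):
    """Classify a stationary point x0 from the sign of f' on either side.

    h must be smaller than the distance to the nearest other stationary
    point (here the stationary points are 1 apart).
    """
    left, right = f_prime(x0 - h), f_prime(x0 + h)
    if left < 0 < right:
        return "local minimum"
    if left > 0 > right:
        return "local maximum"
    return "neither"

results = {x0: (classify(x0), f(x0)) for x0 in (-1, 0, 1)}
```

A sign pattern that does not change, as at a horizontal point of inflection, falls into the "neither" case.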




Solution to Exercise A148

$f(x) \to \infty$ as $x \to$ …
$f(x) \to -\infty$ as $x \to$ …
… as $x \to -\infty$.


Solution to Exercise A149

(a) $f(x) \to \infty$ as $x \to \pm\infty$.

(b) $f(x) \to \infty$ as $x \to \infty$,
$f(x) \to -\infty$ as $x \to -\infty$.

(c) $f(x) \to -\infty$ as $x \to \pm\infty$.

(d) $f(x) \to -\infty$ as $x \to \infty$,
$f(x) \to \infty$ as $x \to -\infty$.


Solution to Exercise A150

$f(x) = x^4 - 2x^2 + 3$.

1. The domain of $f$ is $\mathbb{R}$.

2. $f$ is even, since, for all $x$ in $\mathbb{R}$,
\[ f(-x) = (-x)^4 - 2(-x)^2 + 3 = x^4 - 2x^2 + 3 = f(x). \]

3. Using the hint and completing the square we get $t^2 - 2t + 3 = (t - 1)^2 - 1 + 3$, so
\[ f(x) = (x^2 - 1)^2 + 2 \ge 2 \]
for all $x$ in $\mathbb{R}$, so $f$ is positive on $\mathbb{R}$. This means that $f$ has no $x$-intercepts, as $f(x)$ is never zero.
The $y$-intercept is $f(0) = 3$.

4. By step 3, $f$ is positive on $\mathbb{R}$.

5. In Exercise A147 you found that
• $f$ is increasing on the intervals $(-1, 0)$ and $(1, \infty)$
• $f$ is decreasing on the intervals $(-\infty, -1)$ and $(0, 1)$
• $f$ has stationary points at $x = -1$, 0 and 1
• there is a local minimum at $x = -1$ with $f(-1) = 2$
• there is a local maximum at $x = 0$ with $f(0) = 3$
• there is a local minimum at $x = 1$ with $f(1) = 2$.

6. The power of $x$ in the dominant term is even and its coefficient is positive, so
\[ f(x) \to \infty \text{ as } x \to \pm\infty. \]

This information enables us to sketch the graph.

[Graph of $y = x^4 - 2x^2 + 3$, with the points $(-1, 2)$, $(0, 3)$ and $(1, 2)$ marked.]


Solution to Exercise A151

$f(x) = \frac{x - 3}{2 - x}$.

1. The domain of $f$ is $\mathbb{R} - \{2\}$.

2. $f$ is neither even nor odd, since its domain is not symmetric about the origin.

3. We have $f(x) = 0$ only when $x = 3$, so the only $x$-intercept is 3.
The $y$-intercept is $f(0) = -\frac{3}{2}$.

4. We construct a table of signs for $f$.

x       | (−∞, 2) | 2 | (2, 3) | 3 | (3, ∞)
x − 3   |    −    | − |   −    | 0 |   +
2 − x   |    +    | 0 |   −    | − |   −
f(x)    |    −    | * |   +    | 0 |   −

(Here * marks a point at which $f$ is not defined.)
• $f$ is positive on the interval $(2, 3)$
• $f$ is negative on the intervals $(-\infty, 2)$ and $(3, \infty)$.

5. Using the quotient rule,
\[ f'(x) = \frac{(2 - x) + (x - 3)}{(2 - x)^2} = \frac{-1}{(2 - x)^2}, \]
so $f'(x) < 0$ for all $x$ in the domain; that is, $f$ is decreasing on each interval of its domain.

6. The denominator is 0 when $x = 2$, so the line $x = 2$ is a vertical asymptote.
Also, by step 4,
$f(x) \to -\infty$ as $x \to 2^-$,
$f(x) \to \infty$ as $x \to 2^+$.
The dominant term of the numerator is $x$ and the dominant term of the denominator is $-x$. Thus the power of $x$ is the same in each dominant term. The ratio of the coefficients of the dominant terms is $-1$. Therefore the line $y = -1$ is a horizontal asymptote.

This information enables us to sketch the graph.


Solution to Exercise A152

$f(x) = \frac{1}{x(x + 1)^2}$.

1. The domain of $f$ is $\mathbb{R} - \{0, -1\}$.

2. $f$ is neither even nor odd, since the domain is not symmetric about 0.

3. The equation $f(x) = 0$ has no solution, so there are no $x$-intercepts.
$f(0)$ is not defined, so there is no $y$-intercept.

4. We construct a table of signs for $f$.

x          | (−∞, −1) | −1 | (−1, 0) | 0 | (0, ∞)
x          |    −     | −  |    −    | 0 |   +
(x + 1)^2  |    +     | 0  |    +    | + |   +
f(x)       |    −     | *  |    −    | * |   +

(Here * marks a point at which the function is not defined.)
• $f$ is positive on the interval $(0, \infty)$
• $f$ is negative on the intervals $(-\infty, -1)$ and $(-1, 0)$.

5. Using the quotient and product rules,
\[ f'(x) = -\frac{(x + 1)^2 + 2x(x + 1)}{x^2(x + 1)^4} = -\frac{(x + 1)(x + 1 + 2x)}{x^2(x + 1)^4} = -\frac{3x + 1}{x^2(x + 1)^3}, \]
so $f'(x) = 0$ when $x = -\frac{1}{3}$.

We construct a table of signs for $f'$.

x          | (−∞, −1) | −1 | (−1, −1/3) | −1/3 | (−1/3, 0) | 0 | (0, ∞)
−(3x + 1)  |    +     | +  |     +      |  0   |     −     | − |   −
x^2        |    +     | +  |     +      |  +   |     +     | 0 |   +
(x + 1)^3  |    −     | 0  |     +      |  +   |     +     | + |   +
f'(x)      |    −     | *  |     +      |  0   |     −     | * |   −

We deduce that
• $f$ is increasing on the interval $(-1, -\frac{1}{3})$
• $f$ is decreasing on the intervals $(-\infty, -1)$, $(-\frac{1}{3}, 0)$ and $(0, \infty)$
• $f$ has a stationary point at $x = -\frac{1}{3}$.

By the First Derivative Test, we deduce that there is a local maximum at $x = -\frac{1}{3}$. We have $f(-\frac{1}{3}) = -\frac{27}{4}$.

6. The denominator is 0 when $x = 0$ or $x = -1$, so the lines $x = 0$ and $x = -1$ are vertical asymptotes.
Also, by step 4,
$f(x) \to -\infty$ as $x \to -1^-$,
$f(x) \to -\infty$ as $x \to -1^+$,
$f(x) \to -\infty$ as $x \to 0^-$,
$f(x) \to \infty$ as $x \to 0^+$.
The power of $x$ in the dominant term in the numerator (0, since $1 = x^0$) is less than the power of $x$ in the dominant term of the denominator (3), so the line $y = 0$ is a horizontal asymptote. (Alternatively, by steps 4 and 5, $f(x) \to 0$ as $x \to \pm\infty$.)

This information enables us to sketch the graph.

[Graph of $y = f(x)$, with the local maximum $(-\frac{1}{3}, -\frac{27}{4})$ marked.]
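The stationary point in step 5 can be checked with exact arithmetic. The sketch below assumes that the function in this solution is f(x) = 1/(x(x + 1)^2), which is the rule consistent with the domain and the derivative given above. The variable names are illustrative.

```python
from fractions import Fraction

def f(x):
    return 1 / (x * (x + 1) ** 2)

def f_prime(x):
    # Simplified derivative from step 5: -(3x + 1) / (x^2 (x + 1)^3).
    return -(3 * x + 1) / (x ** 2 * (x + 1) ** 3)

stationary = Fraction(-1, 3)
value = f(stationary)                    # exact value at the stationary point
slope_left = f_prime(Fraction(-1, 2))    # a point in (-1, -1/3)
slope_right = f_prime(Fraction(-1, 4))   # a point in (-1/3, 0)
```

A positive slope on the left and a negative slope on the right agree with the local maximum found by the First Derivative Test.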




Solution to Exercise A153

$f(x) = x\cos x$.

1. The function $f$ has domain $\mathbb{R}$, since both $x$ and $\cos x$ have domain $\mathbb{R}$.

2. The function $f$ is odd, since for all $x$ in $\mathbb{R}$,
\[ f(-x) = -x\cos(-x) = -x\cos x = -f(x). \]
We consider the features of the graph of $f$ for $x \ge 0$, and then rotate the graph we obtain through $\pi$ about the origin.
Although $f$ involves a trigonometric function, it is not periodic because of the factor of $x$.

3. We have $f(x) = 0$ when $x = 0$ or when $\cos x = 0$.
So, for $x \ge 0$, the $x$-intercepts are $0, \pi/2, 3\pi/2, \ldots$.
The $y$-intercept is 0 since $f(0) = 0$.

4. The intervals on which $f$ is positive or negative (for $x > 0$) alternate between the $x$-intercepts in the same way as for the cosine function.
For $x > 0$,
• $f$ is positive on $(0, \pi/2)$, $(3\pi/2, 5\pi/2)$, $(7\pi/2, 9\pi/2)$, ...
• $f$ is negative on $(\pi/2, 3\pi/2)$, $(5\pi/2, 7\pi/2)$, ....

5. $f'(x) = \cos x - x\sin x$, so we omit solving $f'(x) = 0$, as it is not easy.

6. The function $f$ has no asymptotes.

7. Since $-1 \le \cos x \le 1$ for all real numbers $x$, we have
\[ -x \le x\cos x \le x, \quad \text{for } x \ge 0. \]
That is,
\[ -x \le f(x) \le x, \quad \text{for } x \ge 0, \]
so, for $x \ge 0$, the graph of $f$ lies between the lines $y = x$ and $y = -x$. These are the construction lines for this function.
The function $f$, for $x \ge 0$, has the following features:
$f(x) = x$ when $\cos x = 1$
$f(x) = -x$ when $\cos x = -1$.
For $x \ge 0$, the graph of $f$
• meets the construction line $y = x$ when $x = 0, 2\pi, 4\pi, \ldots$
• meets the construction line $y = -x$ when $x = \pi, 3\pi, 5\pi, \ldots$.

This information enables us to sketch the graph.

[Graph of $y = x\cos x$.]


Solution to Exercise A154

$f(x) = x + \sin x$.

1. The function $f$ has domain $\mathbb{R}$, since both $x$ and $\sin x$ have domain $\mathbb{R}$.

2. $x + \sin x$ is odd, since for all $x$ in $\mathbb{R}$,
\[ f(-x) = -x + \sin(-x) = -x - \sin x = -f(x). \]
We consider the features of the graph of $f$ for $x \ge 0$, and then rotate the graph we obtain through $\pi$ about the origin.
Although $f$ involves a trigonometric function, it is not periodic because of the addition of $x$.

3. We have $f(0) = 0$, so 0 is both the $x$-intercept and the $y$-intercept. There are no other values of $x$ for which $f(x) = 0$.

4. $f$ is positive on $(0, \infty)$.

5. $f'(x) = 1 + \cos x$, so $f'(x) = 0$ when $\cos x = -1$, that is, when $x = (2k + 1)\pi$, for any integer $k$. At all other points in $(0, \infty)$, $f'(x) > 0$, so
• $f$ is increasing on $(0, \infty)$
• $f$ has stationary points when $x = \pi, 3\pi, 5\pi, \ldots$.
By the First Derivative Test, we deduce that there is a horizontal point of inflection when $x = \pi, 3\pi, 5\pi, \ldots$.
We have $f(\pi) = \pi$, $f(3\pi) = 3\pi$, ....

6. The function has no asymptotes.

7. Since $-1 \le \sin x \le 1$ for all real numbers $x$, we have
\[ x - 1 \le x + \sin x \le x + 1, \quad \text{for } x \ge 0. \]
That is,
\[ x - 1 \le f(x) \le x + 1, \quad \text{for } x \ge 0, \]
so, for $x \ge 0$, the graph of $f$ lies between the graphs of the functions $y = x - 1$ and $y = x + 1$.
The function $f$, for $x \ge 0$, has the following features:
$f(x) = x$ when $\sin x = 0$
$f(x) = x + 1$ when $\sin x = 1$
$f(x) = x - 1$ when $\sin x = -1$.
So $y = x$, $y = x + 1$ and $y = x - 1$ can be used as construction lines, and for $x \ge 0$, the graph of $f$
• meets the construction line $y = x$ when $x = 0, \pi, 2\pi, \ldots$
• meets the construction line $y = x + 1$ when $x = \pi/2, 5\pi/2, 9\pi/2, \ldots$
• meets the construction line $y = x - 1$ when $x = 3\pi/2, 7\pi/2, 11\pi/2, \ldots$.

This information enables us to sketch the graph.

[Graph of $y = x + \sin x$.]


Solution to Exercise A155 


The modulus function maps any negative number to its corresponding positive value, so the zeros of $f$ are exactly those of $\sin x$, and for each interval on which

• $\sin x$ is positive, we have $f(x) = \sin x$
• $\sin x$ is negative, we reflect in the $x$-axis to get $f(x) = |\sin x|$.

And for any integer $k$, there are

• local maxima with value 1 at $x = \frac{1}{2}(2k + 1)\pi$
• local minima with value 0 at $x = k\pi$.

We therefore obtain the following graph.

[Graph of $y = |\sin x|$.]




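Solution to Exercise A156

Each of these functions has domain $\mathbb{R}$.


Solution to Exercise A157

(a) $e^x(e^x + e^{-x}) = e^{2x} + e^0 = e^{2x} + 1$

(b) …

(c) $(e^{5x} + e^{-5x})(e^{5x} - e^{-5x}) = e^{10x} - e^0 + e^0 - e^{-10x} = e^{10x} - e^{-10x}$


Solution to Exercise A158

(a) $\cosh^2 x - \sinh^2 x$
$= \frac{1}{4}(e^x + e^{-x})^2 - \frac{1}{4}(e^x - e^{-x})^2$
$= \frac{1}{4}(e^{2x} + 2e^x e^{-x} + e^{-2x}) - \frac{1}{4}(e^{2x} - 2e^x e^{-x} + e^{-2x})$
$= \frac{1}{4}(e^{2x} + 2 + e^{-2x} - e^{2x} + 2 - e^{-2x})$
$= \frac{1}{4} \times 4 = 1$

(b) $\cosh x \cosh y + \sinh x \sinh y$
$= \frac{1}{2}(e^x + e^{-x}) \cdot \frac{1}{2}(e^y + e^{-y}) + \frac{1}{2}(e^x - e^{-x}) \cdot \frac{1}{2}(e^y - e^{-y})$
$= \frac{1}{4}(e^x e^y + e^x e^{-y} + e^{-x} e^y + e^{-x} e^{-y}) + \frac{1}{4}(e^x e^y - e^x e^{-y} - e^{-x} e^y + e^{-x} e^{-y})$
$= \frac{1}{4}(e^{x+y} + e^{x-y} + e^{-x+y} + e^{-(x+y)}) + \frac{1}{4}(e^{x+y} - e^{x-y} - e^{-x+y} + e^{-(x+y)})$
$= \frac{1}{2}(e^{x+y} + e^{-(x+y)})$
$= \cosh(x + y)$

(c) $\sinh x \cosh y + \cosh x \sinh y$
$= \frac{1}{2}(e^x - e^{-x}) \cdot \frac{1}{2}(e^y + e^{-y}) + \frac{1}{2}(e^x + e^{-x}) \cdot \frac{1}{2}(e^y - e^{-y})$
$= \frac{1}{4}(e^x e^y + e^x e^{-y} - e^{-x} e^y - e^{-x} e^{-y}) + \frac{1}{4}(e^x e^y - e^x e^{-y} + e^{-x} e^y - e^{-x} e^{-y})$
$= \frac{1}{4}(e^{x+y} + e^{x-y} - e^{-x+y} - e^{-(x+y)}) + \frac{1}{4}(e^{x+y} - e^{x-y} + e^{-x+y} - e^{-(x+y)})$
$= \frac{1}{2}(e^{x+y} - e^{-(x+y)})$
$= \sinh(x + y)$


Solution to Exercise A159

Let $f(x) = \cosh x = \frac{1}{2}(e^x + e^{-x})$; then
\[ f'(x) = \tfrac{1}{2}(e^x - e^{-x}) = \sinh x. \]
Let $g(x) = \sinh x = \frac{1}{2}(e^x - e^{-x})$; then
\[ g'(x) = \tfrac{1}{2}(e^x + e^{-x}) = \cosh x. \]
Thus
\[ \cosh' = \sinh \quad \text{and} \quad \sinh' = \cosh. \]
These are similar to the trigonometric derivatives
\[ \cos' = -\sin \quad \text{and} \quad \sin' = \cos \]
but differ by a minus sign in the first one.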
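The three hyperbolic identities proved in the solution to Exercise A158 can be spot-checked numerically from the exponential definitions. This is a sketch only: the sample points are arbitrary illustrative values, and a numerical check is not a substitute for the algebraic proofs.

```python
import math

def cosh(x):
    """cosh x defined from the exponential function."""
    return (math.exp(x) + math.exp(-x)) / 2

def sinh(x):
    """sinh x defined from the exponential function."""
    return (math.exp(x) - math.exp(-x)) / 2

pairs = [(0.0, 0.0), (0.5, -1.2), (1.0, 2.0), (-2.0, 0.3)]  # arbitrary sample points

identity_a = all(math.isclose(cosh(x) ** 2 - sinh(x) ** 2, 1.0, rel_tol=1e-9)
                 for x, _ in pairs)
identity_b = all(math.isclose(cosh(x) * cosh(y) + sinh(x) * sinh(y), cosh(x + y), rel_tol=1e-9)
                 for x, y in pairs)
identity_c = all(math.isclose(sinh(x) * cosh(y) + cosh(x) * sinh(y), sinh(x + y), rel_tol=1e-9)
                 for x, y in pairs)
```

The sample points avoid x + y = 0, where sinh(x + y) is zero and a purely relative tolerance would be too strict for rounding error.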




Solution to Exercise A160 
$f(x) = \sinh x$.

1. $\sinh x$ has domain $\mathbb{R}$.

2. $\sinh x$ is odd, since
\[ f(-x) = \tfrac{1}{2}(e^{-x} - e^{x}) = -\tfrac{1}{2}(e^{x} - e^{-x}) = -\sinh x = -f(x). \]
It is therefore sufficient to consider the features of the graph of $f$ for $x \ge 0$, and then to rotate the graph we obtain through $\pi$ about the origin.

3. $\sinh x = \frac{1}{2}(e^x - e^{-x}) = 0$ when $e^x = e^{-x}$; so the only zero of $\sinh x$ is 0.
So 0 is both the $x$-intercept and the $y$-intercept.

4. From the graphs of $y = e^x$ and $y = e^{-x}$, we observe that $\sinh x$ is positive for $x > 0$.

5. We know that $\sinh' x = \cosh x$, and also, for all $x$ in $\mathbb{R}$, that $\cosh x \ge 1$, so $\sinh x$ is strictly increasing on $\mathbb{R}$, and so has no stationary points.
(Since $\sinh' x = \cosh x$ and $\cosh 0 = 1$, the graph of $\sinh x$ has gradient 1 at the origin.)

6. Since $e^x \to \infty$ as $x \to \infty$ and $e^{-x} \to 0$ as $x \to \infty$,
\[ \sinh x \to \infty \text{ as } x \to \infty. \]

This information enables us to sketch the graph.

[Graph of $y = \sinh x$.]



Solution to Exercise A161

$\operatorname{cosech} x = \frac{1}{\sinh x}$.

1. $\sinh x = 0$ when $x = 0$, so $\operatorname{cosech} x$ is not defined at 0. Thus $\operatorname{cosech} x$ has domain $\mathbb{R}$, excluding 0.

2. $\operatorname{cosech} x$ is odd, since $\sinh x$ is odd. It is therefore sufficient to consider the features of the graph for $x > 0$, and then to rotate the graph we obtain through $\pi$ about the origin.

3 and 4. We know that $\sinh x > 0$ for $x > 0$, so $\operatorname{cosech} x > 0$ for $x > 0$; thus $\operatorname{cosech} x$ has no zeros. Since $\operatorname{cosech} x$ is also not defined at $x = 0$, it has neither $x$-intercepts nor $y$-intercepts.

5. We know $\sinh x$ is increasing on $\mathbb{R}$, so
• $\operatorname{cosech} x$ is decreasing on $(0, \infty)$
• $\operatorname{cosech} x$ has no local maxima or local minima.

6. We know that when $x = 0$, $\sinh x = 0$ and that $\sinh x \to \infty$ as $x \to \infty$.
Since $\operatorname{cosech} x = 1/\sinh x$, and $\sinh x$ is small when $x$ is close to 0, the line $x = 0$ is a vertical asymptote.
From steps 3 and 4, we know that
\[ \operatorname{cosech} x \to \infty \text{ as } x \to 0^+. \]
Also,
\[ \operatorname{cosech} x \to 0 \text{ as } x \to \infty. \]

This information enables us to sketch the graph.

[Graph of $y = \operatorname{cosech} x$.]




Solution to Exercise A162

The parabola $y^2 = 4ax$ has focus $(a, 0)$, so the parabola $y^2 = 2x$ has focus $(1/2, 0)$.

[Sketch of the parabola $y^2 = 2x$.]


Solution to Exercise A163

We have $a^2 = 3$ and $b^2 = 2$, so $a = \sqrt{3}$ and $b = \sqrt{2}$.

[Sketch of the ellipse with equation $\frac{x^2}{3} + \frac{y^2}{2} = 1$, which meets the axes at $(\pm\sqrt{3}, 0)$ and $(0, \pm\sqrt{2})$.]

We have $a = \sqrt{3}$ and $b = \sqrt{2}$ so
\[ e = \sqrt{1 - \frac{b^2}{a^2}} = \sqrt{1 - \frac{2}{3}} = \sqrt{\frac{1}{3}} = \frac{1}{\sqrt{3}}. \]
The eccentricity is $1/\sqrt{3}$.


Solution to Exercise A164

We have $a^2 = 3$ and $b^2 = 2$, so $a = \sqrt{3}$ and $b = \sqrt{2}$.

[Sketch of the hyperbola with equation $\frac{x^2}{3} - \frac{y^2}{2} = 1$.]

We have $a = \sqrt{3}$ and $b = \sqrt{2}$ so
\[ e = \sqrt{1 + \frac{b^2}{a^2}} = \sqrt{1 + \frac{2}{3}} = \sqrt{\frac{5}{3}} = \frac{\sqrt{5}}{\sqrt{3}}. \]
The eccentricity is $\sqrt{5}/\sqrt{3}$.


Solution to Exercise A165

(a) We can complete the square in the equation
\[ x^2 + y^2 - 2x - 6y + 1 = 0 \]
to obtain
\[ (x - 1)^2 - 1 + (y - 3)^2 - 9 + 1 = 0, \]
that is,
\[ (x - 1)^2 + (y - 3)^2 = 9. \]
So the equation represents a circle with centre $(1, 3)$ and radius $\sqrt{9} = 3$.

(b) If we complete the square in the equation
\[ x^2 + y^2 + x + y + 1 = 0, \]
we obtain the equation
\[ \left(x + \tfrac{1}{2}\right)^2 + \left(y + \tfrac{1}{2}\right)^2 = -\tfrac{1}{2}. \]
This equation does not represent a circle: there are no points satisfying it since its left-hand side is always non-negative whereas its right-hand side is negative.

(c) If we complete the square in the equation
\[ x^2 + y^2 - 2x + 4y + 5 = 0, \]
we obtain the equation
\[ (x - 1)^2 + (y + 2)^2 = 0. \]
Thus the equation represents the single point $(1, -2)$.

(d) Here the coefficients of $x^2$ and $y^2$ are both 2, so we divide the equation by 2 to give
\[ x^2 + y^2 + \tfrac{1}{2}x - \tfrac{3}{2}y - \tfrac{5}{2} = 0. \]
If we complete the square in this equation, we obtain the equation
\[ \left(x + \tfrac{1}{4}\right)^2 - \tfrac{1}{16} + \left(y - \tfrac{3}{4}\right)^2 - \tfrac{9}{16} - \tfrac{5}{2} = 0, \]
that is,
\[ \left(x + \tfrac{1}{4}\right)^2 + \left(y - \tfrac{3}{4}\right)^2 = \tfrac{25}{8}. \]
Thus the equation represents the circle with centre $(-\frac{1}{4}, \frac{3}{4})$ and radius $\frac{5}{2\sqrt{2}} = \frac{5}{4}\sqrt{2}$.
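The completing-the-square argument in the solution to Exercise A165 follows the same pattern in every part, so it can be sketched as a small function. The name circle_from_general and the coefficient names f, g, h (for the equation x^2 + y^2 + fx + gy + h = 0) are illustrative assumptions. The coefficients used for part (d) are those of the equation after division by 2, as reconstructed from the stated centre and radius.

```python
from fractions import Fraction

def circle_from_general(f, g, h):
    """For x^2 + y^2 + f*x + g*y + h = 0, return (centre, radius squared).

    Completing the square gives (x + f/2)^2 + (y + g/2)^2 = f^2/4 + g^2/4 - h.
    A positive right-hand side is a circle, zero is a single point,
    and a negative value means that no points satisfy the equation.
    """
    f, g, h = Fraction(f), Fraction(g), Fraction(h)
    centre = (-f / 2, -g / 2)
    radius_squared = f * f / 4 + g * g / 4 - h
    return centre, radius_squared

part_a = circle_from_general(-2, -6, 1)
part_b = circle_from_general(1, 1, 1)
part_c = circle_from_general(-2, 4, 5)
part_d = circle_from_general(Fraction(1, 2), Fraction(-3, 2), Fraction(-5, 2))
```

Returning the squared radius keeps the arithmetic exact and lets the caller distinguish the three cases before taking a square root.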
Solution to Exercise A166 


(a) Two points on the line are $(1, -1)$ and $(0, -2)$ (obtained from $t = 0$ and $t = -1$, respectively). So the line is as follows.

[Sketch of the line through $(1, -1)$ and $(0, -2)$.]

We eliminate the parameter $t$ by writing $t = x - 1$, so $y = (x - 1) - 1$; that is, the equation of the line is $y = x - 2$.

(b) We eliminate the parameter $t$ by writing $t = x/2$, so $y = 2(x/2) - 2$; that is, $y = x - 2$. This shows that the points given by the parametrisation satisfy the equation of the line.
Let $(a, b)$ be any point on the line, so $b = a - 2$. Putting $t = a/2$ gives
\[ \alpha(t) = \alpha(a/2) = (2(a/2), 2(a/2) - 2) = (a, a - 2) = (a, b). \]
Therefore, every point on the line is given by this parametrisation.


Solution to Exercise A167 
(a) Setting $(p, q) = (1, 2)$ and $(r, s) = (3, 6)$ in
$\alpha(t) = (p + (r - p)t, q + (s - q)t)$ for $t \in \mathbb{R}$
gives
$\alpha(t) = (1 + 2t, 2 + 4t)$, $t \in \mathbb{R}$.
(Other parametrisations are possible.)


(b) t=5 t=3;t 


=l 
ae 


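Solution to Exercise A168

[Figure: the unit circle with the following points marked: $t = \pi/6$ at $(\sqrt{3}/2, 1/2)$; $t = \pi/2$ at $(0, 1)$; $t = 3\pi/4$ at $(-\sqrt{2}/2, \sqrt{2}/2)$; $t = 3\pi/2$ at $(0, -1)$; $t = 5\pi/3$ at $(1/2, -\sqrt{3}/2)$. The point $(1, 0)$ is also labelled.]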


Solution to Exercise A169 
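(a) $\alpha(t) = (3\cos t, 3\sin t)$, for $t \in [0, 2\pi]$.
(b) $\alpha(t) = (2 + 3\cos t, 1 + 3\sin t)$, for $t \in [0, 2\pi]$.


Solution to Exercise A170
(a) $\alpha(t) = (\frac{1}{2}t^2, t)$, for $t \in \mathbb{R}$.
(b) $\alpha(t) = (\sqrt{3}\cos t, \sqrt{2}\sin t)$, for $t \in [0, 2\pi]$.
(c) $\alpha(t) = (\sqrt{3}\sec t, \sqrt{2}\tan t)$, for $t \in [-\pi, \pi]$, excluding $-\pi/2$ and $\pi/2$.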




Solution to Exercise A171 


The parametric equations for this curve are
\[ x = a\cosh t, \quad y = b\sinh t. \]
We eliminate $t$ by writing
\[ x/a = \cosh t, \quad y/b = \sinh t \]
and using the identity
\[ \cosh^2 t - \sinh^2 t = 1, \]
to obtain
\[ \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1, \]
which is the equation for a hyperbola in standard form.

Since $\cosh t$ is always positive, this parametrisation gives only one half of the hyperbola, namely the right-hand half corresponding to positive values of $x$ (because $\cosh t$ takes all values in $[1, \infty)$). Since $\sinh t$ can be positive or negative, we get the whole of this right-hand half.
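A numerical sketch of this conclusion follows. It samples the hyperbolic parametrisation and checks that every sampled point satisfies the hyperbola equation, lies in the region x ≥ a, and that both signs of y occur. The values of a and b and the sample parameters are illustrative assumptions, not taken from the text.

```python
import math

def branch_point(a, b, t):
    """Point (a cosh t, b sinh t) of the hyperbolic parametrisation."""
    return (a * math.cosh(t), b * math.sinh(t))

a, b = 2.0, 1.5  # illustrative values with a, b > 0
samples = [branch_point(a, b, t) for t in (-3.0, -1.0, 0.0, 0.5, 3.0)]

# Every sample satisfies x^2/a^2 - y^2/b^2 = 1 up to rounding error.
on_hyperbola = all(abs((x / a) ** 2 - (y / b) ** 2 - 1.0) < 1e-9 for x, y in samples)
# cosh t >= 1, so x >= a: only the right-hand half is reached.
right_half = all(x >= a for x, _ in samples)
# sinh t takes both signs, so points above and below the x-axis occur.
signs_of_y = {math.copysign(1.0, y) for _, y in samples if y != 0.0}
```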



Acknowledgements 


Grateful acknowledgement is made to the following sources. 


Cover image: © Mark Owen 


Unit A1


René Descartes (Subsection 1.1): André Hatala [e.a.] (1997) De eeuw van 
Rembrandt, Bruxelles: Crédit communal de Belgique. This work is in the 
public domain in its country of origin and other countries and areas where 
the copyright term is the author’s life plus 100 years or less 


André Weil (Subsection 2.2): Taken from: 
www.xkimdir.com/meslek/matematikci.html 


Leonhard Euler (Section 3): Artist: Jakob Emanuel Handmann, image in 
the public domain 


Johann Bernoulli (Section 3): Artist: Johann Rudolf Huber. Image 
sourced: http://test.maa.org. Found on Wikipedia 


Unit A2 
Abraham de Moivre (Subsection 2.4): Taken from
https://famous-mathematicians.com/abraham-de-moivre/


Carl Friedrich Gauss (Subsection 2.4): Artist: Carl Friedrich Gauß, Öl auf
Leinwand (Oil on Canvas), 1840. Image in the public domain


Alexander Ostrowski (Subsection 2.4): Konrad Jacobs, MFO. This file is 
licensed under the Creative Commons Attribution-Share Alike Licence 
http://creativecommons.org/licenses/by-sa/3.0/ Germany 


Unit A3 


Goldbach letter to Euler (Subsection 1.1): Letter from Christian Goldbach 
to Leonhard Euler 1742. This image is in the public domain 


Bourbaki group (Subsection 1.3): © Collection Privee/Archives 
Charmet/Bridgeman Images 


Giuseppe Peano (Subsection 1.5): Photographer unknown. This image is 
in the public domain 


Bertrand Russell (Subsection 1.5): Photographer unknown. This image is 
in the public domain 

Gerhard Gentzen (Subsection 1.5): Eckart Menzler-Trott /
https://commons.wikimedia.org/wiki/File:Gerhard_Gentzen.jpg This file is
licensed under the Creative Commons
Attribution-Noncommercial-ShareAlike Licence
http://creativecommons.org/licenses/by-sa/3.0/

Augustus De Morgan (Subsection 2.5): Sophia Elizabeth De Morgan. 
Image in the public domain 


G. H. Hardy (Subsection 3.1): Photographer unknown 


Unit A4 


Gerardus Mercator (Subsection 4.1): Engraver unknown. This image is in 
the public domain 


Apollonius (Subsection 5.1): Taken from www.maa.org. Image in the 
public domain 


Every effort has been made to contact copyright holders. If any have been 
inadvertently overlooked the publishers will be pleased to make the 
necessary arrangements at the first opportunity. 


Index 


absolute value 8, 290 
addition table for Zn 159
additive identity 107, 108, 126, 127, 160 
additive inverse 107, 108, 126, 127, 160 
algebraically closed 143 
angle between vectors 75, 79 
Apollonius 341 
Argand diagram 119 
Argand, Monsieur 119 
Arg z 128
argument (of a complex number) 128 
arithmetic 

in C 126-128

in R 106-108 

in Zn 157, 160, 161, 168-169 
associativity 107, 126, 160, 161 
asymptote 286 

horizontal 311 

of hyperbola 344 

vertical 311 
asymptotic behaviour of functions 310 
axis of a parabola 284, 342 


backwards substitution 164 
method of 165-166 
Bernoulli, Johann 35 
bijection (one-to-one correspondence) 50 
Bourbaki, Nicolas 199 


C 116 
Cartesian coordinates 4 
Cartesian form (of a complex number) 129, 149 
converting to 130, 150 
centre 
of a circle 9, 339 
of a hyperbola 344 
of an ellipse 343 
circle 339 
as a plane set 21
centre 9, 339 
equation of 9-10, 342, 345 
parametric equations 351, 356 
radius 9 
unit 21 
closed interval 19-20 
closure 107, 126, 160, 161 


codomain of a function 35-36 
coefficients 
equating 112 
of a polynomial 109 
common factor 163 
commutativity 107, 126, 160, 161 
comparing coefficients 112 
completing the square 285, 306 
complex conjugate 121, 149 
properties of 122, 124 
complex conjugate pair 144 
complex exponential function 146 
complex numbers 116 
addition and subtraction 120 
additive identity 126, 127 
additive inverse 126, 127 
argument of 128 
arithmetic of 120, 126 
Cartesian form 129 
complex conjugate of 121 
dividing 124, 135, 136 
exponential form 148 
imaginary part 116 
modulus of 122 
multiplicative identity 127 
multiplicative inverse 127 
multiplying 120, 134, 136, 150 
polar form 128 
principal argument 128 
product 120, 134, 136, 150 
quotient 124, 135, 136, 150 
real part 116 
reciprocal of 124, 150 
roots of 137 
finding 139, 142 
complex plane 119 
complex variable 117 
components of a vector 67 
composite function 54, 327 
definitions 55, 57 
composite number 169 
conclusion of an implication 197 
congruence 
modulo an integer 152, 253, 263 
properties of 154 
modulo 2π 265


conic (conic section) 338 
general equation of 347 
conjugate, complex 121 
conjunction 194 
constant function 283 
construction line 324 
contradiction, proof by 236 
contraposition, proof by 241 


contrapositive of an implication 202 


converse of an implication 202 
coordinate axes 

in R³ 10
coprime integers 163 
corollary 193 
cosech function 334 
cosh function 332 
coth function 334 
counterexample 225 
cubic equation 109 
cubic function 286 


de Moivre’s Theorem 138, 148 
de Moivre, Abraham 138 

De Morgan, Augustus 228 
decreasing function 307 
degenerate conic 339, 347 
degree of a polynomial 109, 301 
Descartes, René 4 

difference of two sets 33 
direct proof 209 

directrix 340 

disc 23 

disjoint sets 32, 256 
disjunction 195 

distance formula 


for C 123
for R² 9
for R³ 13


distance function 37 
distributivity 107, 127, 162 


division of complex numbers 124, 135, 136, 150 


Division Theorem 151, 199 
divisor 151 

domain of a function 35-36, 301 
dominant term of a polynomial 313


dot product (scalar product) 75 


eccentricity 340 
of a hyperbola 344 
of an ellipse 343 
element of a set 13 
ellipse 338, 343 
equation of 343 
parametric equations 353, 356 
ellipsis 16 
empty set 19 
equality of sets 24, 27 
equality of vectors 63, 68 
equate corresponding coefficients 112 
equivalence 204 
equivalence class 255 
representative of 262 
equivalence relation 248 
Euclid 4, 237 
Euclid’s Algorithm 164, 240 
Euclidean space 
three-dimensional 12 
two-dimensional 4 
Euler’s Formula 147 
Euler’s Identity 147 
Euler, Leonhard 35, 106, 148, 192, 289 
even function 302 
exhaustion, proof by 210 
existential quantifier 207 
existential statement 207 
exponential form (of a complex number) 148, 
149 
converting to 150 
exponential function 289, 332 
exponential function, complex 146 


factor of a polynomial 111 
Factor Theorem 

in C 143 

in R 111, 222 
Fermat, Pierre de 228 
field 108, 126, 169 
finite decimal 106 
finite set 14 
First Derivative Test 309 
floor function 291 
floor of a real number 252 
focus 340 


function 35-36 

codomain of 35-36 

composite 54, 57, 327 

constant 283 

cubic 286 

decreasing 307 

distance 37 

domain of 35-36 

even 302 

exponential 289, 332 

hybrid 330 

hyperbolic 332 

identity 40 

image of a set under 40 

image of an element under 36 

image set of 42 

increasing 307 

integer part 290 

inverse of 47-50 

linear 283 

linear rational 286 

many-to-one 47 

modulus 290 

odd 302 

one-to-one 47 

onto 43 

periodic 287, 302 

polynomial 301 

quadratic 284 

rational 301 

real 37, 281 

reciprocal 286 

restriction of 54 

rule of 35-36 

trigonometric 287 

with domain a finite set 39 
Fundamental Theorem of Algebra 143 
Fundamental Theorem of Arithmetic 213 


Gauss, Carl Friedrich 148, 153 
general equation of a conic 347 
Gentzen, Gerhard 208 

Geometric Series Identity 210 
Goldbach’s Conjecture 191 
Goldbach, Christian 192 

gradient of a line 5 

graph-sketching strategy 314, 324 
greatest common divisor (GCD) 163 


half-open/half-closed interval 19-20 
half-plane 22 
Hardy, G. H. 237 
highest common factor (HCF) 163 
horizontal asymptote 311 
horizontal point of inflection 309 
hybrid function 330 
hyperbola 339, 344 
equation of 344 
parametric equations 354, 356 
rectangular 345 
hyperbolic function 332 
hypothesis of an implication 197 


identity function 40 
image 
of a set under a function 40 
of an element under a function 36 
image set of a function 42 
imaginary axis 119 
imaginary number 117 
imaginary part (of a complex number) 116, 117
implication 197
conclusion of 197
contrapositive of 202
converse of 202
hypothesis of 197
increasing function 307
increasing/decreasing criteria 308
indirect proof 235
induction, proof by 226
infinite set 14
injective (one-to-one) function 47
integer part 252, 290
integer part function 290
integers 14, 104
intercept 304
x-intercept of a line 5
y-intercept of a line 5
intersection of sets 31
interval 19-20
closed 19-20
half-closed 19-20
half-open 19-20
open 19-20
inverse function 47, 50
irrational numbers (irrationals) 105


Lagrange, Joseph-Louis 106 
Lambert, Johann Heinrich 106, 333 
Leibniz, Gottfried Wilhelm 35 
lemma 193 
line in R²
as a plane set 21
equation of 5-6
gradient of 5
vector equation of 73
y-intercept of 5
line in R³, vector equation of 73
line, parametric equations of 348, 356
linear equation 109
linear equations in Zn 170, 172
linear function 283 
linear rational function 286 
local maximum 308 
local minimum 308 


magnitude 
of a real number 8, 290
of a vector 68 
of a vector, in terms of scalar product 76 
major axis of an ellipse 343 
many-to-one function 47 
mapping 37 
mathematical induction 227 
maximum of a function 309 
member of a set 13 
Mercator, Gerardus 333 
minimum of a function 309 
minor axis of an ellipse 343 
modular addition 157 
modular arithmetic 151, 157, 263 
modulo 2π 266
modular subtraction 161
modulus
in Zn 157
of a complex number 122
properties of 123, 124
of a congruence 152
of a real number 8, 290
modulus function 290 
multiplication of complex numbers 120, 134, 
136, 150 
multiplication table for Zn 162 
multiplicative identity 107, 108, 127, 161 


multiplicative inverse 107, 108, 127 
in Zn 162, 165, 168, 169, 240
mutually perpendicular 10 


N 14, 104 

natural numbers 14, 104 

negation 193 

negative 108 

negative (function on an interval) 304 
non-degenerate conic 339 

normal (vector) 82 


odd function 302 

one-to-one correspondence 50 
one-to-one function 47 

onto function 43 

open interval 19-20 

ordered pair 4 

ordered triple 11 

origin of coordinate system 4, 10 
Ostrowski, Alexander 143 


parabola 284, 338 
equation of 342 
parametric equations 354, 356 

parallel lines 6, 252 

Parallelogram Law for vector addition 64 

parameter 348 

parametric equations 348 

parametrisation 
of a circle 351 
of a hyperbola 354 
of a line 348 
of a parabola 354 
of an ellipse 353 

partition of a set 256 

Pascal, Blaise 228 

Peano, Giuseppe 208 

periodic function 287, 302 
period of 287 

perpendicular lines 6 
gradients of 6 

perpendicular vectors 76 

plane 4 

plane in R³ 81 
equation of 84 

plane set (plane figure) 21 

Poincaré, Henri 226 


point of inflection 309 
polar form (of a complex number) 128, 149 
converting to 131, 150 
polynomial 109 
coefficients of 109 
degree of 109 
factor of 111 
roots of 111, 137, 144 
with real coefficients 144 
zeros of 111, 137 
polynomial equation 109 
polynomial function 301 
coefficient of dominant term 313 
degree 301 
dominant term 313 
position vector 70 
positive (function on an interval) 304 
prime (number) 169 
principal argument (of a complex number) 128 
Principle of Mathematical Induction 227 
product of complex numbers 150 
proof 
by contradiction 236 
by contraposition 241 
by exhaustion 210 
by induction 226 
direct 209 
indirect 235 
proper subset 26 
proposition (statement) 191 
variable 192 
proposition (type of theorem) 193 


Q 14, 104 
quadratic equation 109 
quadratic formula 110 
quadratic function 284 
quantifier 
existential 207 
universal 206 
quotient 151 
of complex numbers 124, 135, 136, 150 




R 4, 14, 103 
R² 4, 71 
R³ 11, 71 
R* 14, 104 


range (image set) of a function 42 
rational function 301 
rational number 14 
rational numbers (rationals) 14, 104 
real axis 119 
real function 37, 281 
real line 4, 103 
real number 14 
real numbers (reals) 103-108 
real part (of a complex number) 116, 117 
real polynomial 111 
real variable 14 
reciprocal 108 
of a complex number 124, 125, 150 
reciprocal function 286 
rectangular hyperbola 345 
recurring decimal 106 
reflection 38 
reflexive property (reflexivity) 154, 248 
relation 246 
relatively prime integers 163 
remainder 151 
representative of an equivalence class 262 
restriction of a function 54 
right-handed system of axes 10 
root 
of a complex number 137, 142 
of a polynomial 111, 113, 137, 144, 222, 231 
of unity 142 
rotation 38 
rule of a function 35-36 
Russell, Bertrand 208 


scalar 62 

scalar multiple of a vector 63, 68 

scalar product 75 
algebraic properties of 77 
and perpendicularity 76 
component form 78 

scaling of a graph 294 

sech function 334 

Second Derivative Test 323 




set 13 

difference 33 

disjoint 32 

empty 19 

equality 24, 27 

finite 14 

infinite 14 

intersection 31 

notation 15-19 

plane 21 

proper subset of 26 

subset of 24 

union 29 
set of representatives 262 
singleton (set) 15 
sinh function 332 
solution set 18 
special angles (sines and cosines of) 129 
square root of a negative real number 117 
standard position 

of a circle 341 

of a hyperbola 341, 344 

of a parabola 341, 342 

of an ellipse 341, 343 
statement 191 

existential 207 

universal 206 
stationary point 308 
stretching see scaling 
strictly increasing/decreasing 307 
subset 24 

proper 26 
surjective (onto) function 43 
symmetric property (symmetry) 154, 248 


table of signs 305, 309 
tanh function 333 
terminating decimal 106 
theorem 193 
transformation of the plane 38 
reflection 38 
rotation 38 
translation 38 
transitive property (transitivity) 154, 248 
translation 38 
translation of a graph 293 
Triangle Law for vector addition 64 




trigonometric function 287 
trigonometric identities 129, 134 


union of sets 29 
unit circle 21 
unit vector 66, 69 
universal quantifier 206 
universal statement 206 


variable proposition 192 
vector 62 
addition 64, 68 
algebra, properties of 65 
angle between 75, 79 
arithmetic 64, 68 


component form 67 
definition 62 
difference of 65 
equality 63, 68 
equation of a line 73 
magnitude of 62, 68, 76 
negative of 63, 68 
normal to a plane 82 
perpendicular 76 
position 70 
scalar multiple of 63, 68 
scalar product 79 
subtraction 65, 68 
unit 66, 69 
zero 62, 68 
Venn diagram 14 
vertex 
of a hyperbola 344 
of a parabola 284, 342 
of an ellipse 343 
vertical asymptote 311 


Weil, André 19 
Whitehead, Alfred North 208 


Z 14, 104 
Zn 157 
addition table for 159 
multiplication table for 162 
Zp, p prime 169 
zero of a function 304 
zero of a polynomial 111, 137 
zero vector 62, 68 


